Random Grammar-based Testing for Covering All Non-Terminals
Aloïs Dreyfus, Pierre-Cyrille Héam and Olga Kouchnarenko
FEMTO-ST - CNRS UMR 6174 - Université de Franche-Comté - INRIA CASSIS
16 route de Gray - 25030 Besançon, France
Email: fi[email protected]
Abstract—In the context of software testing, generating complex data inputs is frequently performed using a grammar-based specification. For combinatorial reasons, an exhaustive generation of the data – of a given size – is practically impossible, and most approaches are either based on random techniques or on coverage criteria. In this paper, we show how to combine these two techniques by biasing the random generation in order to optimise the probability of satisfying a coverage criterion.
Keywords—Random testing, Grammar-based testing.
I. INTRODUCTION
A. Motivation
Producing trusted software is a central issue in software engineering. Testing remains an inescapable step to ensure software quality. In reaction to the limitations of manual testing, recent years have seen a rise in the research interest for systematic testing frameworks grounded in theory. Random testing is a natural approach, empirically known to detect many kinds of bugs. However, by definition, low-probability behaviours cannot be adequately tested in that way. Conversely, non-random testing tends to focus on a few edge cases of particular interest to the tester, at the expense of all others. Indeed, it can cover various behaviours, but their choice depends on the tester's priorities and, in general, each behaviour is tested in a unique way.

In [1], it is explained how to bias a uniform random testing approach using constraints given by a coverage criterion, in order to optimise the probability of satisfying this criterion. The technique is developed for path generation in a graph. The contribution of the present paper consists in enriching this approach with a coverage criterion on non-terminal symbols of the grammar, allowing the user to apply it to grammar-based testing.
B. Related Work
Grammar-based testing is frequently used for generating structured inputs, as in [2] for parser testing or in [3] to test refactoring engines (program transformation software). Systematic combinatorial approaches [4] lead to a huge number of sequences, and symbolic approaches are frequently preferred [5], [6], [7]. In [8], a generic tool for generating test data from grammars has been proposed. This tool does not provide any random feature but is based on rule coverage algorithms and techniques, as defined in [2], [9], [10], [11].

Random test generation techniques – initially proposed in [12], [13] – are frequently used for practical reasons, as in [14], [15], [1]. Combining random generation and grammar-based testing is explored in [16], [17], [18], [19], [20], [21], without exploiting any coverage criteria, or using an isotropic random walk as in [22].
C. Layout
Section II presents the notions and notations used in this paper. Section III explains how to optimise random testing to satisfy a given coverage criterion. The theoretical contributions are provided in Section IV, which shows how to use this technique to optimise the coverage of non-terminal symbols in a grammar-based testing context. An illustrating example is developed in Section V. Finally, Section VI concludes.

II. FORMAL BACKGROUND
A. Context-free Grammars and Random Generation
In this paper, the cardinality of a finite set S is denoted |S|.

a) Context-free Grammars: A context-free grammar is a tuple G = (Σ, Γ, S, R), where Σ and Γ are disjoint finite alphabets, S ∈ Γ is the initial symbol, and R is a finite subset of Γ × (Σ ∪ Γ)*. The elements of Σ are called terminal symbols, and the elements of Γ are called non-terminal symbols. An element (X, u) of R is called a rule of the grammar and is frequently denoted X → u. A word w ∈ (Σ ∪ Γ)* is a successor of v ∈ (Σ ∪ Γ)* for the grammar G if there exist v_1 ∈ Σ*, v_2, v_3 ∈ (Σ ∪ Γ)* and X ∈ Γ such that v = v_1 X v_2, w = v_1 v_3 v_2 and X → v_3 ∈ R. A complete derivation of the grammar G is a finite sequence x_1, ..., x_k of words of (Σ ∪ Γ)* such that x_1 = S, x_k ∈ Σ* and, for every i, x_{i+1} is a successor of x_i. A derivation tree of G is a finite tree whose internal nodes are labelled by letters of Γ, whose leaves are labelled by elements of Σ ∪ {ε}, whose root is labelled by S, and satisfying: if a node is labelled by X ∈ Γ and its children are labelled by α_1, ..., α_k (in this order), then either α_1 = ε and k = 1, or all the α_i's are in Γ ∪ Σ and (X, α_1 ... α_k) ∈ R. The size of a derivation tree is given by the number of its nodes.

Example 1 – Context-free grammar.
Let us consider the grammar G = ({a, b}, {S, T}, S, R), with R = {S → Tb, S → aSb, T → ε}. The sequence S, aSb, aTbb, abb is a complete derivation of the grammar (as v_1 ∈ Σ* in the definition of a successor, this derivation is obviously a left-most derivation). The associated derivation tree is

      S
    / | \
   a  S  b
     / \
    T   b
    |
    ε
Note that there is a bijection between the set of complete derivations of a grammar and the set of derivation trees of this grammar. For a context-free grammar G, E_n(G) denotes the set of derivation trees of G with n nodes. A derivation tree covers an element X of Γ if at least one of its nodes is labelled by X. For instance, for the tree in Example 1, the elements S and T are covered since they appear in the derivation tree.

b) Uniform Random Generation: The present issue is, given a positive integer n and a context-free grammar, to compute randomly, with a uniform distribution, a derivation tree of size n of this grammar. We briefly explain here how to tackle this problem by using well-known counting techniques [23]. Notice that more advanced techniques allow a faster computation, as in [24].

As usual, the non-terminal symbols are denoted by capital letters. Given a context-free grammar G = (Σ, Γ, S, R), a non-terminal symbol X in Γ, and a positive integer i, the number of derivation trees of size i generated by (Σ, Γ, X, R) is denoted by x(i), i.e., using the corresponding lowercase letter. Given a positive integer n, for each symbol S ∈ Γ, the sequence of positive integers s(1), ..., s(k), ... is introduced. The recursive computation of these s(i)'s is as follows. For each strictly positive integer k and each rule r = (S, w_1 S_1 ... w_n S_n w_{n+1}) ∈ R, with w_j ∈ Σ* and S_i ∈ Γ, let us set

  β_r = 1 + Σ_{i=1}^{n+1} |w_i|,
  α_r(k) = Σ_{i_1 + ... + i_n = k − β_r} Π_{j=1}^{n} s_j(i_j)  if n ≠ 0,
  α_r(k) = 0  if n = 0 and k ≠ β_r,
  α_r(β_r) = 1  if n = 0.

It is known [23, Theorem I.1] that s(k) = Σ_{r ∈ R ∩ ({S} × (Σ ∪ Γ)*)} α_r(k). Since, by hypothesis, there is no rule of the form (S, T) in R, with S, T ∈ Γ, all i_j's involved in the definition of α_r are strictly less than k. This way, the s(i)'s can be recursively computed.

Consider for instance the grammar ({a, b}, {X}, X, {r_1, r_2, r_3}) with r_1 = (X, XX), r_2 = (X, a) and r_3 = (X, b). One has β_{r_1} = 1 + 0 = 1, β_{r_2} = 1 + 1 = 2, β_{r_3} = 1 + 1 = 2. Therefore x(k) = Σ_{i+j=k−1} x(i)x(j) if k ≠ 2, and x(2) = 1 + 1 + Σ_{i+j=1} x(i)x(j) = 2. It follows that x(1) = 0, x(2) = 2, x(3) = x(1)x(1) = 0, x(4) = x(1)x(2) + x(2)x(1) = 0, x(5) = x(2)x(2) = 4, etc. The two derivation trees of size 2 are the trees with root X and a single leaf a or b. The four derivation trees of size 5 are the trees with root X and two children Z_1 and Z_2, where both Z_1 and Z_2 are derivation trees of size 2.

In order to generate derivation trees of size n, all the s(i)'s, for S ∈ Γ and i ≤ n, have to be computed with the above method.
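For illustration, this counting method can be transcribed into a few lines of Python; the encoding of the rules as (left-hand side, right-hand side) pairs and the function names are ours:

```python
from functools import lru_cache

# Rules of the example grammar X -> XX | a | b, one (lhs, rhs) pair per rule
RULES = [("X", ["X", "X"]), ("X", ["a"]), ("X", ["b"])]
NONTERMINALS = {"X"}

def beta(rhs):
    # beta_r = 1 (for the root node) + number of terminal letters in the rule
    return 1 + sum(1 for s in rhs if s not in NONTERMINALS)

@lru_cache(maxsize=None)
def count(symbol, k):
    """x(k): number of derivation trees of size k rooted in `symbol`."""
    total = 0
    for lhs, rhs in RULES:
        if lhs == symbol:
            nts = tuple(s for s in rhs if s in NONTERMINALS)
            total += compositions(nts, k - beta(rhs))
    return total

@lru_cache(maxsize=None)
def compositions(nts, budget):
    # alpha_r-style sum over i_1 + ... + i_m = budget of products of counts
    if budget < 0:
        return 0
    if not nts:
        return 1 if budget == 0 else 0
    return sum(count(nts[0], i) * compositions(nts[1:], budget - i)
               for i in range(budget + 1))

print([count("X", k) for k in range(1, 6)])  # [0, 2, 0, 0, 4]
```

The printed sequence matches the values x(1), ..., x(5) computed above.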
This can be performed in polynomial time. Afterwards, the random generation is done recursively, using the algorithm given in Fig. 1.

Random Generation
Input: G = (Σ, Γ, X, R) a context-free grammar, n a strictly positive integer.
Output: a derivation tree t of G of size n.
Algorithm:
1. Let {r_1, r_2, ..., r_ℓ} be the set of the elements of R whose first element is X.
2. If Σ_{j=1}^{ℓ} α_{r_j}(n) = 0, then return “Exception”.
3. Pick i ∈ {1, ..., ℓ} with probability Prob(i = j) = α_{r_j}(n) / Σ_{j=1}^{ℓ} α_{r_j}(n).
4. Let r_i = (X, Z_1 ... Z_k), with Z_j ∈ Σ ∪ Γ.
5. The root symbol of t is X.
6. The children of t are Z_1, ..., Z_k, in this order.
7. Let {i_1, ..., i_m} = {j | Z_j ∈ Γ}.
8. Pick (ℓ_1, ..., ℓ_m) ∈ N^m such that ℓ_1 + ... + ℓ_m = n − β_{r_i}, with probability Prob(x_1 = ℓ_1, ..., x_m = ℓ_m) = Π_{j=1}^{m} z_{i_j}(ℓ_j) / α_{r_i}(n).
9. For each i_j, the i_j-th sub-tree of t is obtained by running the Random Generation algorithm on (Σ, Γ, Z_{i_j}, R) and ℓ_j.
10. Return t.
Fig. 1. Random Generation algorithm

It is known [23] that this algorithm provides a uniform generation of derivation trees of size n, i.e. each derivation tree occurs with the same probability. Note that an exception is raised at Step 2 if there is no element of the given size. For the example presented before, there is no element of size 3, so it is impossible to generate a derivation tree of size 3.

Running the algorithm on this example with n = 2, one considers at Step 1 the set {r_1, r_2, r_3}, since all these rules have X as left element. Since α_{r_1}(2) = 0, α_{r_2}(2) = 1 and α_{r_3}(2) = 1, at Step 3 the probability that i = 1 is 0, the probability that i = 2 is 1/2 and the probability that i = 3 is 1/2. If i = 2 is picked, the generated tree has X as root symbol and a as unique child. Running the algorithm on this example with n = 3 stops at Step 2, since there is no tree of size 3. When running the algorithm on this example with n = 5, the set {r_1, r_2, r_3} is considered at Step 1. Since α_{r_1}(5) = 4, α_{r_2}(5) = 0 and α_{r_3}(5) = 0, i = 1 is picked with probability 1. Therefore, the tree has X as root symbol, and its two children are both labelled by X. At Step 7, the considered set is {1, 2}. At Step 8, one has n − β_{r_1} = 5 − 1 = 4. The probability that ℓ_1 = 1 and ℓ_2 = 3 is 0, since x(1) = 0. Similarly, the probability that ℓ_1 = 3 and ℓ_2 = 1 is 0 too. Hence the probability that ℓ_1 = 2 and ℓ_2 = 2 is 1. Afterwards, the algorithm is recursively executed on each child with n = 2: each of the 4 trees is chosen with probability 1/4.
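The algorithm of Fig. 1 can be sketched in Python for the example grammar; the nested (symbol, children) representation of trees is ours, and the counting functions are recomputed here so that the sketch is self-contained:

```python
import random
from functools import lru_cache

# Example grammar X -> XX | a | b, as (lhs, rhs) pairs
RULES = [("X", ["X", "X"]), ("X", ["a"]), ("X", ["b"])]
NONTERMINALS = {"X"}

def beta(rhs):
    return 1 + sum(1 for s in rhs if s not in NONTERMINALS)

@lru_cache(maxsize=None)
def count(symbol, k):
    return sum(compositions(tuple(s for s in rhs if s in NONTERMINALS),
                            k - beta(rhs))
               for lhs, rhs in RULES if lhs == symbol)

@lru_cache(maxsize=None)
def compositions(nts, budget):
    if budget < 0:
        return 0
    if not nts:
        return 1 if budget == 0 else 0
    return sum(count(nts[0], i) * compositions(nts[1:], budget - i)
               for i in range(budget + 1))

def alpha(rhs, k):
    return compositions(tuple(s for s in rhs if s in NONTERMINALS),
                        k - beta(rhs))

def pick_sizes(nts, budget):
    """Step 8: draw (l_1, ..., l_m) proportionally to the product of counts."""
    if not nts:
        return []
    weights = [count(nts[0], i) * compositions(nts[1:], budget - i)
               for i in range(budget + 1)]
    i = random.choices(range(budget + 1), weights=weights)[0]
    return [i] + pick_sizes(nts[1:], budget - i)

def generate(symbol, n):
    """Steps 1-10 of Fig. 1; trees are nested (symbol, children) pairs."""
    rules = [rhs for lhs, rhs in RULES if lhs == symbol]       # Step 1
    weights = [alpha(rhs, n) for rhs in rules]
    if sum(weights) == 0:                                      # Step 2
        raise ValueError("no derivation tree of size %d" % n)
    rhs = random.choices(rules, weights=weights)[0]            # Step 3
    nts = tuple(s for s in rhs if s in NONTERMINALS)           # Step 7
    sizes = pick_sizes(nts, n - beta(rhs))                     # Step 8
    children = [generate(s, sizes.pop(0)) if s in NONTERMINALS else s
                for s in rhs]                                  # Steps 4-6, 9
    return (symbol, children)                                  # Step 10

def tree_size(t):
    return 1 + sum(tree_size(c) if isinstance(c, tuple) else 1 for c in t[1])

random.seed(0)
print(generate("X", 5))  # a uniformly drawn derivation tree with 5 nodes
```

As in the walkthrough above, asking for a tree of size 3 raises the Step-2 exception, and every tree of size 5 has the shape X with two size-2 subtrees.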
III. MIXING RANDOM TESTING AND COVERAGE CRITERIA

In a context of functional testing, the strength of random testing is to quickly provide many different test data for each behaviour of the system. Moreover, these test data are independent of the choices of the test designer, and consequently they can catch problems (s)he did not anticipate. For instance, fuzz testing is particularly relevant for testing security requirements [7]. However, random testing can miss an important behaviour occurring with a very small probability. To exploit the advantages of both random testing and deterministic testing, a solution is to combine random generation and coverage criteria.

The general schema for this combination, as described in [1], is the following: considering a random generation algorithm of test data of size n and a coverage criterion C (each element of C is or is not covered by each possible test), the goal is to use the generation algorithm N times in order to optimise the probability of covering all elements of C. For each element e ∈ C, we denote by p_{e,n} the probability that a generated test of size n covers e. One can easily check that generating N test data independently of C provides a probability of covering C of at most 1 − (1 − p_min)^N, where p_min = min_{e ∈ C} {p_{e,n}}. This probability is the way to measure the quality of the testing approach, relatively to C. A better way is to repeat N times the following procedure:
1) Pick at random an element e ∈ C with a probability π_e, and
2) Generate uniformly a test of size n covering e.
This procedure requires to know how to uniformly generate a test of size n covering a given element, and to choose the probabilities π_e in order to optimise the probability of covering all elements of C.

Following [1], the optimisation requires solving the following constraint system: maximise p satisfying

  p ≤ Σ_{e ∈ C} π_e p_{e,f,n} / p_{e,n}  for all f ∈ C,
  Σ_{e ∈ C} π_e = 1,

where p_{e,f,n} is the probability that a randomly generated test of size n covers both e and f. This linear programming problem can be solved in an efficient way, using simplex-like approaches.

In summary, in order to combine random testing and a coverage criterion, it is required to solve a constraint system and to know 1) how to randomly generate a test of a given size covering a given element, 2) how to compute the p_{e,n}'s, and 3) how to compute the p_{e,f,n}'s. The rest of the paper is dedicated to the problem of the random generation of derivation trees of a grammar, with the coverage criterion All non-terminal symbols. More precisely, given a grammar G = (Σ, Γ, S, R), the coverage criterion being Γ, and a test of size n being a derivation tree of G of size n, we say that X ∈ Γ is covered by a test if the derivation tree covers X.

IV. COMPUTING p_{X,n} AND p_{X,Y,n}
In this section, G = (Σ, Γ, S, R) is a context-free grammar. We denote by E_n(G) the set of derivation trees of size n of G. We respectively denote by E_{X,n}(G) and E_{X,Y,n}(G) the sets of derivation trees of size n of G covering X, and covering both X and Y.

Let p_{X,n} [resp. p_{X,Y,n}] be the probability for a uniformly generated derivation tree of size n to cover X [resp. both X and Y]. Clearly, if E_n(G) is empty, then p_{X,n} = 0 [resp. p_{X,Y,n} = 0]. Otherwise, p_{X,n} = |E_{X,n}(G)| / |E_n(G)| [resp. p_{X,Y,n} = |E_{X,Y,n}(G)| / |E_n(G)|]. Therefore, computing the probabilities defined in Section III – needed to solve the linear constraint program – reduces to the computation of the cardinalities of the sets E_{X,n}(G) and E_{X,Y,n}(G).

A. Computing |E_{X,n}(G)| and |E_{X,Y,n}(G)|

To compute |E_{X,n}(G)|, we build a grammar G_X such that E_n(G_X) and E_{X,n}(G) are in bijection (and therefore have the same number of elements).

For every w ∈ (Γ ∪ Σ)*, [w]_1 is recursively defined by: [ε]_1 = ε, [Zw]_1 = (Z,1)[w]_1 (with Z ∈ Γ) and [aw]_1 = a[w]_1 (with a ∈ Σ). Intuitively, [w]_1 is obtained from w by changing each letter of w in Γ into the corresponding pair with 1 as second element. For instance, with the grammar of Example 1, one has [aSbbT]_1 = a(S,1)bb(T,1). For every w ∈ (Γ ∪ Σ)*, [w]_0 and [w]_2 are defined in exactly the same way, changing all the 1's into 0's (resp. 2's). For every w ∈ (Γ ∪ Σ)*, {w}_{1,2} is defined as the set of words w' ∈ (Σ ∪ Γ × {1,2})* obtained from w by replacing each occurrence of each letter Z of Γ either by (Z,1) or by (Z,2), with the restriction that at least one occurrence is replaced by some (Z,1). The letters in Σ remain unchanged. For instance, if w = aSbT, then {w}_{1,2} = {a(S,1)b(T,1), a(S,1)b(T,2), a(S,2)b(T,1)}. Notice that if w ∈ Σ*, then {w}_{1,2} = ∅, since the constraint cannot be satisfied.

Let G_X = (Σ, Γ × {0,1,2}, (S,1), R_X) where R_X = R_0 ∪ R_1 ∪ R'_1 ∪ R_2 with:
• R_0 = {(Z,0) → [w]_0 | Z → w ∈ R},
• R_1 = {(Z,1) → w' | Z ≠ X and ∃ Z → w ∈ R such that w' ∈ {w}_{1,2}},
• R'_1 = {(X,1) → [w]_0 | X → w ∈ R},
• R_2 = {(Z,2) → [w]_2 | Z → w ∈ R and Z ≠ X}.
Intuitively, adding the value 0 to a symbol in Γ means that, if this rule is used, there exists an occurrence of X at an upper position in the derivation tree. Adding the value 1 to a symbol in Γ means that there is no occurrence of X at an upper position, but there exists an occurrence of X at this or a lower position in the derivation tree. The value 2 means that there is no occurrence of X at an upper or lower position.

Example 2 – G_X. Consider the grammar G = ({a,b}, {S,T,X}, S, R) with R = {S → SS, S → aT, S → Xb, T → aa, X → TX, X → b}. The grammar G_X has the set of rules as follows:
{(S,0) → (S,0)(S,0), (S,0) → a(T,0), (S,0) → (X,0)b, (T,0) → aa, (X,0) → b, (X,0) → (T,0)(X,0)}
∪ {(S,1) → (S,1)(S,1), (S,1) → (S,1)(S,2), (S,1) → (S,2)(S,1), (S,1) → a(T,1), (S,1) → (X,1)b}
∪ {(X,1) → b, (X,1) → (T,0)(X,0)}
∪ {(S,2) → (S,2)(S,2), (S,2) → a(T,2), (S,2) → (X,2)b, (T,2) → aa}.
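The operation {w}_{1,2} is easily made effective; the following sketch (the function name and the list encoding of words are ours) enumerates the tagged variants of w = aSbT from the text:

```python
from itertools import product

NONTERMINALS = {"S", "T", "X"}

def tagged_variants(w):
    """{w}_{1,2}: tag every non-terminal of w with 1 or 2, keeping only the
    words in which at least one non-terminal carries the tag 1."""
    positions = [i for i, s in enumerate(w) if s in NONTERMINALS]
    variants = []
    for tags in product((1, 2), repeat=len(positions)):
        if 1 not in tags:
            continue  # the all-2 tagging is excluded by the restriction
        word = list(w)
        for i, t in zip(positions, tags):
            word[i] = (word[i], t)
        variants.append(tuple(word))
    return variants

for v in tagged_variants(["a", "S", "b", "T"]):
    print(v)  # the three elements of {aSbT}_{1,2} listed in the text
```

In particular, a word with no non-terminal yields the empty set, matching the remark above.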
Proposition 1 – Bijection. There exists a bijection between E_n(G_X) and E_{X,n}(G). Example 3 illustrates several elements of the following proof.
Proof:
Let ϕ be the function from (Γ × {0,1,2} ∪ Σ)* into (Γ ∪ Σ)* inductively defined by: ϕ(ε) = ε, ϕ(aw) = aϕ(w) if a ∈ Σ, and ϕ((Z,α)w) = Zϕ(w) if Z ∈ Γ and α ∈ {0,1,2}. Intuitively, ϕ is a projection deleting the components in {0,1,2}.

By construction of G_X, if (Z,α) → w is a rule of G_X, then ϕ((Z,α)) → ϕ(w) is a rule of G. Therefore, if x_1, ..., x_k is a complete derivation of G_X, then ϕ(x_1), ..., ϕ(x_k) is a complete derivation of G. Moreover, the initial symbol of G_X is (S,1), and all rules of R_X with a left-hand side in (Γ \ {X}) × {1} have a right-hand side where an element of Γ × {1} occurs. Therefore, since x_k ∈ Σ*, the only way to destroy the 1 component is to use a rule with left-hand side (X,1). It follows that the derivation tree associated with ϕ(x_1), ..., ϕ(x_k) covers X. Consequently, ϕ induces a function from E_n(G_X) into E_{X,n}(G).

Let x_1, ..., x_k and x'_1, ..., x'_k be complete derivations of G_X such that ϕ(x_1), ..., ϕ(x_k) = ϕ(x'_1), ..., ϕ(x'_k). Assuming that x_1, ..., x_k ≠ x'_1, ..., x'_k, there exists a minimal index i such that x_i ≠ x'_i. Since x_1 = (S,1) = x'_1, one has i ≥ 2. Therefore x_{i−1} = x'_{i−1} exists. Set x_{i−1} = v_1(Z,α)v_2, where (Z,α), with Z ∈ Γ and α ∈ {0,1,2}, is the rewritten occurrence. One of the following cases arises:
• If α = 0, then there exist Z → w and Z → w' in R such that x_i = v_1[w]_0v_2 and x'_i = v_1[w']_0v_2. Since ϕ(x_i) = ϕ(x'_i), it follows that ϕ([w]_0) = ϕ([w']_0). But ϕ([w]_0) = w and ϕ([w']_0) = w', proving that x_i = x'_i, a contradiction.
• If α = 2, then the same proof holds, replacing 0 by 2.
• If α = 1 and Z = X, then, again, the same proof holds.
• If α = 1 and Z ≠ X, then there exist Z → w and Z → w' in R such that x_i = v_1w_1v_2 and x'_i = v_1w_2v_2, with w_1 ∈ {w}_{1,2} and w_2 ∈ {w'}_{1,2}. Since ϕ(x_i) = ϕ(x'_i), one has w = w'. Therefore w_1, w_2 ∈ {w}_{1,2}. Since w_1 ≠ w_2, consider the first letter of w_1 which is different from the corresponding letter in w_2. By construction of {w}_{1,2}, this letter must be in Γ × {1,2} in both w_1 and w_2, say (T,β_1) and (H,β_2). Now, since ϕ((T,β_1)) = ϕ((H,β_2)), one has T = H. Therefore, without loss of generality, we may assume that β_1 = 1 and β_2 = 2. Consequently, x_i has a prefix of the form u(T,1): in the derivation tree corresponding to x_1, ..., x_k, the subtree rooted in this (T,1) contains an X (by construction of R_1). Conversely, x'_i has a prefix of the form u(T,2): in the derivation tree corresponding to x'_1, ..., x'_k, the subtree rooted in this (T,2) does not contain any X (by construction of R_2). It follows that the two corresponding derivations cannot have the same image by ϕ, a contradiction.
It follows that ϕ induces an injective function from E_n(G_X) into E_{X,n}(G).

Now let y_1, ..., y_k be a complete derivation of G whose corresponding tree t is in E_{X,n}(G). We consider the tree t' labelled in Γ × {0,1,2} ∪ Σ which has exactly the same structure (the same set of positions) as t and such that:
• If a node of t is labelled by a letter of Σ, then the corresponding node in t' has the same label.
• If a node ρ of t is labelled by a letter T ∈ Γ, then the node ρ in t' is labelled by (T,1) if there is no X on the path from the root to ρ (excluding ρ) and the subtree rooted in ρ (including ρ) contains at least one X. It is labelled by (T,0) if there is at least one X on the path from the root to ρ. Otherwise, it is labelled by (T,2).
One can check that t' corresponds to a complete derivation of G_X whose image by ϕ is exactly the complete derivation corresponding to t, proving that ϕ is surjective, which concludes the proof.

Example 3 – Illustration of the proof of Prop. 1. Consider the grammar G = ({a,b}, {S,T,X}, S, R) with R = {S → SS, S → aT, S → Xb, T → aa, X → TX, X → b} of Example 2. Consider the derivation tree of E_{X,19}(G) depicted in Fig. 2, corresponding to the complete derivation S, SS, SSS, aTSS, aaaSS, aaaXbS, aaabbS, aaabbXb, aaabbTXb, aaabbaaXb, aaabbaabb.

Fig. 2. Derivation tree of G – Example 3

The associated derivation in G_X is (S,1), (S,1)(S,1), (S,2)(S,1)(S,1), a(T,2)(S,1)(S,1), aaa(S,1)(S,1), aaa(X,1)b(S,1), aaabb(S,1), aaabb(X,1)b, aaabb(T,0)(X,0)b, aaabbaa(X,0)b, aaabbaabb, whose derivation tree, an element of E_19(G_X), is depicted in Fig. 3.

Fig. 3. Derivation tree of G_X – Example 3

Using Proposition 1 and the results described in Section II, it is possible to compute |E_{X,n}(G)|. If we denote by ℓ the maximal number of elements of Γ (with multiplicity) occurring in a right-hand side of G, then G_X has O(2^ℓ |R|) rules, whose sizes are bounded by the maximal size of the rules of G. Therefore, if ℓ is reasonable, the computation of |E_{X,n}(G)| is tractable in practice, even for a quite large value of n. As mentioned above, the computation of |E_{X,n}(G)| immediately provides p_{X,n}. It is also important to point out that G_X allows the uniform random computation of derivation trees of G of a given size and covering X.

Since E_{X,X,n}(G) = E_{X,n}(G), computing |E_{X,X,n}(G)| is a direct application of the above techniques. Computing |E_{X,Y,n}(G)|, with Y ≠ X, can almost be done by a similar construction: the difference is that the construction of the rules of the grammar G_XY, from the grammar G_X, must take into account that both X and Y have to appear in the derivation. Let G_XY = (Σ, Γ × {0,1,2} × {0,1,2}, ((S,1),1), R_XY) where R_XY = R_0 ∪ R_1 ∪ R'_1 ∪ R_2 with:
• R_0 = {((Z,i),0) → [w]_0 | (Z,i) → w ∈ R_X},
• R_1 = {((Z,i),1) → w' | Z ≠ Y and ∃ (Z,i) → w ∈ R_X such that w' ∈ {w}_{1,2}},
• R'_1 = {((Y,i),1) → [w]_0 | (Y,i) → w ∈ R_X},
• R_2 = {((Z,i),2) → [w]_2 | (Z,i) → w ∈ R_X and Z ≠ Y}.
A proof similar to the one of Proposition 1 allows showing that there is a computable bijection between E_n(G_XY) and E_{X,Y,n}(G). Note that the size of G_XY is approximately 2^ℓ times greater than the size of G_X.
V. EXPERIMENTS

The approach has been evaluated on a simplified version of the grammar of JSON (JavaScript Object Notation) – a language-independent common format for declaring objects. Formally, let us consider the grammar G = (Σ, Γ, Object, R) with Σ having the eight following elements: Σ = { “,” , “{” , “:” , “}” , letter, digit, “[” , “]” }. The set Γ of non-terminal symbols is composed of the elements Object, Members, Pair, Array, Elements and Value. Finally, the set R contains the following rules:
• Object → {} | { Members }
• Members → Pair | Pair , Members
• Pair → letter : Value
• Array → [ ] | [ Elements ]
• Elements → Value | Value , Elements
• Value → letter | Object | digit | Array
(To provide a more readable specification, the convention consisting in using capital letters for non-terminal symbols is not entirely respected here.)
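Under an encoding of the rules as (lhs, rhs) pairs (a sketch of ours, with letter and digit kept as single terminal symbols), the counting technique of Section II applies directly to this grammar:

```python
from functools import lru_cache

# The simplified JSON grammar as (lhs, rhs) pairs
RULES = [
    ("Object",   ["{", "}"]),
    ("Object",   ["{", "Members", "}"]),
    ("Members",  ["Pair"]),
    ("Members",  ["Pair", ",", "Members"]),
    ("Pair",     ["letter", ":", "Value"]),
    ("Array",    ["[", "]"]),
    ("Array",    ["[", "Elements", "]"]),
    ("Elements", ["Value"]),
    ("Elements", ["Value", ",", "Elements"]),
    ("Value",    ["letter"]),
    ("Value",    ["Object"]),
    ("Value",    ["digit"]),
    ("Value",    ["Array"]),
]
NONTERMINALS = {lhs for lhs, _ in RULES}

@lru_cache(maxsize=None)
def count(symbol, k):
    """Number of derivation trees of size k rooted in `symbol`."""
    total = 0
    for lhs, rhs in RULES:
        if lhs == symbol:
            nts = tuple(s for s in rhs if s in NONTERMINALS)
            beta = 1 + sum(1 for s in rhs if s not in NONTERMINALS)
            total += compositions(nts, k - beta)
    return total

@lru_cache(maxsize=None)
def compositions(nts, budget):
    # ways to split `budget` into subtree sizes, weighted by the counts
    if budget < 0:
        return 0
    if not nts:
        return 1 if budget == 0 else 0
    return sum(count(nts[0], i) * compositions(nts[1:], budget - i)
               for i in range(budget + 1))

# the only Object-tree of size 3 is "{}"; the Values of size 2 are letter, digit
print(count("Object", 3), count("Value", 2))  # 1 2
```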
In order to optimise the coverage criterion, we have to solve the following system, while maximising p: for every f ∈ {Object, Members, Pair, Array, Elements, Value},

  p ≤ π_Object · p_{Object,f,n}/p_{Object,n} + π_Members · p_{Members,f,n}/p_{Members,n} + π_Pair · p_{Pair,f,n}/p_{Pair,n} + π_Array · p_{Array,f,n}/p_{Array,n} + π_Elements · p_{Elements,f,n}/p_{Elements,n} + π_Value · p_{Value,f,n}/p_{Value,n},

together with

  π_Object + π_Members + π_Pair + π_Array + π_Elements + π_Value = 1.
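For a criterion with only two elements, the optimisation can even be sketched without an LP solver, by a direct search over π; the coefficients below are invented for illustration and are not the values computed for the JSON grammar:

```python
# Hypothetical coefficients c[e][f] = p_{e,f,n} / p_{e,n} for a two-element
# criterion C = {e1, e2}; these numbers are made up for illustration.
C = [[1.0, 0.2],   # tests generated to cover e1: they rarely cover e2
     [0.9, 1.0]]   # tests generated to cover e2: they often cover e1 too

def objective(pi1):
    """min over f of sum_e pi_e * c[e][f], with pi2 = 1 - pi1."""
    pi = (pi1, 1.0 - pi1)
    return min(pi[0] * C[0][f] + pi[1] * C[1][f] for f in range(2))

# Maximise p over the simplex by a direct grid search; a real implementation
# would hand the linear program to a solver such as lp_solve.
best_pi1, best_p = max(((i / 100000, objective(i / 100000))
                        for i in range(100001)), key=lambda t: t[1])
print(round(best_pi1, 5), round(best_p, 5))  # 0.11111 0.91111
```

With these invented coefficients, the optimum mixes the two elements (π_e1 = 1/9), illustrating that the best strategy is not always to target a single element.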
Using a slightly modified version of the Hoa tool [25], the computation of the probabilities p_{X,n} and p_{X,Y,n} for all X, Y ∈ Γ and n = 20 has been performed efficiently (in a few seconds). Instantiating the system above with these values yields a purely numerical linear program, which then has to be solved while maximising p. This linear programming problem can be solved in an efficient way, using simplex-like approaches. We have used the tool lp_solve (http://lpsolve.sourceforge.net/) to solve it, and the result is that p = 1 if π_Object = 0, π_Members = 0, π_Pair = 0, π_Array = 0, π_Elements = 1 and π_Value = 0. It means that, for this simple example, the optimised approach to cover all the non-terminal symbols consists in generating derivation trees covering Elements. Indeed, in this grammar, the generation of a derivation tree covering the non-terminal symbol Elements provides a tree covering all the other non-terminal symbols.

VI. CONCLUSION
In this paper, we have presented a method for exploiting a coverage criterion together with random testing in the context of grammar-based testing. This automatic method lies in building a grammar and then in solving a linear constraint system, which can be done by adapted tools, even for large values. In the future, we plan to extend the approach to other coverage criteria, such as rule coverage, and also to handle attribute grammars with constraints formalising the semantics of context-free languages.
REFERENCES
[1] A. Denise, M.-C. Gaudel, S.-D. Gouraud, R. Lassaigne, J. Oudinet, and S. Peyronnet, “Coverage-biased random exploration of large models and application to testing,” STTT, vol. 14, no. 1, pp. 73–93, 2012.
[2] P. Purdom, “A sentence generator for testing parsers,” BIT, vol. 12, no. 3, pp. 366–375, 1972.
[3] B. Daniel, D. Dig, K. Garcia, and D. Marinov, “Automated testing of refactoring engines,” in ESEC/FSE 2007: Proceedings of the ACM SIGSOFT Symposium on the Foundations of Software Engineering. New York, NY, USA: ACM Press, September 2007.
[4] D. Coppit and J. Lian, “Yagg: an easy-to-use generator for structured test inputs,” in ASE, D. F. Redmiles, T. Ellman, and A. Zisman, Eds. ACM, 2005, pp. 356–359.
[5] R. Lämmel and W. Schulte, “Controllable combinatorial coverage in grammar-based testing,” in TestCom, ser. LNCS, M. Uyar, A. Duale, and M. Fecko, Eds., vol. 3964. Springer, 2006, pp. 19–38.
[6] R. Majumdar and R.-G. Xu, “Directed test generation using symbolic grammars,” in ASE, R. E. K. Stirewalt, A. Egyed, and B. Fischer, Eds. ACM, 2007, pp. 134–143.
[7] P. Godefroid, A. Kiezun, and M. Levin, “Grammar-based whitebox fuzzing,” in PLDI, R. Gupta and S. P. Amarasinghe, Eds. ACM, 2008, pp. 206–215.
[8] Z. Xu, L. Zheng, and H. Chen, “A toolkit for generating sentences from context-free grammars,” in Software Engineering and Formal Methods. IEEE, 2010, pp. 118–122.
[9] R. Lämmel, “Grammar testing,” in FASE, ser. Lecture Notes in Computer Science, H. Hußmann, Ed., vol. 2029. Springer, 2001, pp. 201–216.
[10] L. Zheng and D. Wu, “A sentence generation algorithm for testing grammars,” in COMPSAC (1), S. Ahamed, E. Bertino, C. Chang, V. Getov, L. Liu, H. Ming, and R. Subramanyan, Eds. IEEE Computer Society, 2009, pp. 130–135.
[11] T. Alves and J. Visser, “A case study in grammar engineering,” in SLE, ser. Lecture Notes in Computer Science, D. Gasevic, R. Lämmel, and E. V. Wyk, Eds., vol. 5452. Springer, 2008, pp. 285–304.
[12] J. Duran and S. Ntafos, “A report on random testing,” in ICSE ’81: Proceedings of the 5th International Conference on Software Engineering. Piscataway, NJ, USA: IEEE Press, 1981, pp. 179–183.
[13] R. Hamlet, “Random testing,” in Encyclopedia of Software Engineering. Wiley, 1994, pp. 970–978.
[14] P. Godefroid, N. Klarlund, and K. Sen, “DART: directed automated random testing,” in PLDI ’05: Proceedings of the 2005 ACM SIGPLAN Conference on Programming Language Design and Implementation. New York, NY, USA: ACM, 2005, pp. 213–223.
[15] C. Oriat, “Jartege: A tool for random generation of unit tests for Java classes,” in QoSA/SOQUA, ser. Lecture Notes in Computer Science, R. Reussner, J. Mayer, J. Stafford, S. Overhage, S. Becker, and P. Schroeder, Eds., vol. 3712. Springer, 2005, pp. 242–256.
[16] B. McKenzie, “Generating strings at random from a context-free grammar,” University of Canterbury, Tech. Rep. TR-COSC 10/97, 1997.
[17] T. J. Hickey and J. Cohen, “Uniform random generation of strings in a context-free language,” SIAM J. Comput., vol. 12, no. 4, pp. 645–655, 1983.
[18] P. Maurer, “The design and implementation of a grammar-based data generator,” Softw., Pract. Exper., vol. 22, no. 3, pp. 223–244, 1992.
[19] P.-C. Héam and C. Nicaud, “Seed: An easy-to-use random generator of recursive data structures for testing,” in ICST. IEEE Computer Society, 2011, pp. 60–69.
[20] F. Dadeau, J. Levrey, and P.-C. Héam, “On the use of uniform random generation of automata for testing,” Electr. Notes Theor. Comput. Sci., vol. 253, no. 2, pp. 37–51, 2009.
[21] P.-C. Héam and C. Masson, “A random testing approach using pushdown automata,” in TAP, ser. Lecture Notes in Computer Science, M. Gogolla and B. Wolff, Eds., vol. 6706. Springer, 2011, pp. 119–133.
[22] I. Enderlin, F. Dadeau, A. Giorgetti, and F. Bouquet, “Grammar-based testing using realistic domains in PHP,” in ICST, G. Antoniol, A. Bertolino, and Y. Labiche, Eds. IEEE, 2012, pp. 509–518.
[23] P. Flajolet and R. Sedgewick, Analytic Combinatorics. Cambridge University Press, 2008.
[24] A. Denise and P. Zimmermann, “Uniform random generation of decomposable structures using floating-point arithmetic,” Theor. Comput. Sci., vol. 218, no. 2, pp. 233–248, 1999.
[25] I. Enderlin, “Hoa project, a set of PHP libraries.”