Classifying Pattern and Feature Properties to Get a Θ(n) Checker and Reformulation for Sliding Time-Series Constraints
Nicolas Beldiceanu, Mats Carlsson, Claude-Guy Quimper, Maria-Isabel Restrepo-Ruiz
N. Beldiceanu, M. Carlsson, C.-G. Quimper and M. I. Restrepo
TASC (LS2N-CNRS), IMT Atlantique, FR – 44307 Nantes, France
RISE SICS, Sweden
Laval University, Québec, Canada
Abstract.
Given a sequence X of n variables, a time-series constraint ctr using the Sum aggregator, and a sliding time-series constraint enforcing the constraint ctr on each sliding window of X of m consecutive variables, we describe a Θ(n) time complexity checker, as well as a Θ(n) space complexity reformulation for such a sliding constraint.

While sequence constraints on sliding windows were introduced a long time ago for counting and for sum constraints, e.g. see among_seq in [4,16,8] and sliding_sum in [5,13], no sliding automaton constraint has been introduced so far, even though automaton constraints have been known since 2004 [7,14]. More recently, in the context of planning problems, constraints on streams were introduced in [11,12] for comparing two stream variables pointwise or for stating constraints adapted from Linear Temporal Logic. However, in the context of a long sequence or of a data stream [1], imposing a constraint on a full sequence does not make much sense, as we rather want to focus on sliding windows. Compositional time-series constraints combining a regular expression σ, a feature f, and an aggregator g were introduced in [6,2]. We first provide an example of a sliding time-series constraint.

Example 1.
Given a sequence X = 3 1 3 3 2 1 1 2 2 2 4 4 3 1 2 2, we want to compute the sum of the subsequences of X corresponding to increasing sequences, i.e. to maximal occurrences of the pattern ‘<(<|=)∗<|<’, in every window of size 10 of X. Such windows are shown in the figure by a dotted line, where each solid line-segment indicates an increasing sequence. The number to the left of each window is the sum of the elements of the window belonging to an increasing sequence located inside the window. Beyond this example we want a generic approach to deal with a variety of patterns and features.

[Figure: the sliding windows of X drawn below its signature > < = > > = < = = < = > > < =]

Contributions and methodology

Our contributions are threefold.
– By pursuing the compositional style for defining time-series constraints [6], we introduce sliding time-series constraints, assuming g is the Sum aggregator. This allows one to define a fair variety of sliding constraints in a generic way, in fact constraints in the time-series catalogue [2].
– We provide a Θ(n) time complexity checker for such constraints, which is crucial when extracting patterns from long sequences in the context of model acquisition [15,18].
– We describe a Θ(n) space complexity reformulation, which is memory efficient.

To obtain our contributions we use the following methodology.
– We come up with three simple equations allowing one to compute the contribution of a window [i, j] (with i ≤ j) wrt the results (a) on the full sequence $X = x_1 x_2 \dots x_n$, (b) on the prefix $x_1 x_2 \dots x_j$ which ends at position j, and (c) on the suffix $x_i x_{i+1} \dots x_n$ which starts at position i.
– We study the properties of both regular expressions and features:
  • We systematically categorise regular expressions by partitioning their words into a restricted set of classes, so that each regular expression can be compactly represented by a finite set of classes.
  • We identify key pattern and feature properties.
  For each pair of word classes and feature properties, we prove that a given equation holds or provide a counterexample.
– Finally, we show how the equations can be directly turned into checkers and reformulations.

The categorisation of a regular expression and the identification of the properties of a pattern are done mechanically by checking whether some derived regular languages are empty.

Section 2 provides the necessary background on words and time-series constraints. Section 3 introduces a small number of pattern properties, while Section 4 (i) defines the sliding time-series constraints we consider, (ii) classifies regular expressions in relation to sliding windows, (iii) shows how to compute the contribution of a sliding window based on pattern and feature properties, and finally (iv) presents a Θ(n) time complexity checker and a Θ(n) space complexity reformulation for such sliding time-series constraints.

Word
Consider a finite alphabet Σ. A word w over Σ is a sequence of letters $w_1 w_2 \dots w_\ell$ of the alphabet Σ, and its length ℓ is denoted by |w|. The empty word is denoted by ε. The reverse of w is the word $w_\ell w_{\ell-1} \dots w_1$, denoted $w^r$. The concatenation of two words is denoted by putting them side by side. A word v is a factor of a word x if there exist two words u and w such that x = uvw; when u = ε, v is a prefix of x; when w = ε, v is a suffix of x. If v is not empty and different from x, then v is a proper factor of x.

Time-series constraints

We assume the reader is familiar with regular expressions and automata [10]. A time-series constraint g_f_σ(r, X) is a constraint which restricts an integer result variable r to be the result of some computations over a sequence of integer variables X. The components of a time-series constraint we reuse from [6] are a pattern σ, a feature f, and an aggregator g. A pattern σ is described by a regular expression over the alphabet Σ = {‘<’, ‘=’, ‘>’} whose language $\mathcal{L}_\sigma$ does not contain the empty word, and by two non-negative integers $b_\sigma$ and $a_\sigma$, where $b_\sigma + a_\sigma$ is smaller than or equal to the length of the smallest word of $\mathcal{L}_\sigma$. A feature and an aggregator are functions over integer sequences as illustrated in Table 1. Note that all functions f and g introduced in Table 1 are commutative.

Let $S = s_1 s_2 \dots s_{n-1}$ be the signature of a time series X, which is defined by the constraints $(x_i < x_{i+1} \Leftrightarrow s_i = $ ‘<’$) \land (x_i = x_{i+1} \Leftrightarrow s_i = $ ‘=’$) \land (x_i > x_{i+1} \Leftrightarrow s_i = $ ‘>’$)$ for all $i \in [1, n-1]$. If a sub-signature $s_i s_{i+1} \dots s_{j-1}$ is a maximal word matching σ in the signature of X, then the subsequence $x_{i+b_\sigma} x_{i+b_\sigma+1} \dots x_{j-a_\sigma}$ is called a σ-pattern wrt X, and the subsequence $x_i x_{i+1} \dots x_j$ is called an extended σ-pattern wrt X. The non-negative integers $b_\sigma$ and $a_\sigma$ trim the left and right borders of an extended σ-pattern to obtain a σ-pattern from which a feature value is computed.

(Features)
f       value
one     $1$
width   $j - i - b_\sigma - a_\sigma + 1$
surf    $\sum_{k=i+b_\sigma}^{j-a_\sigma} x_k$
max     $\max_{k \in [i+b_\sigma,\, j-a_\sigma]} x_k$
min     $\min_{k \in [i+b_\sigma,\, j-a_\sigma]} x_k$

(Aggregator)
g       value
Sum     $\sum_{k=1}^{c} f_k$

(Patterns)
σ               L_σ                        b_σ  a_σ   r  n  o  e  s
Inflexion       <(<|=)∗>|>(>|=)∗<                     n  n  y  n  n
BumpOnDecSeq    >><>>                                 n  n  n  n  n
DipOnIncSeq     <<><<                                 n  n  n  n  n
Dec             >                                     y  y  n  y  y
Inc             <                                     y  y  n  y  y
Steady          =                          0    0     y  y  n  y  y
DecTerrace      >=+>                                  y  y  n  n  n
IncTerrace      <=+<                                  y  y  n  n  n
Plain           >=∗<                                  y  n  y  n  n
Plateau         <=∗>                                  y  n  y  n  n
ProperPlain     >=+<                                  y  n  y  n  n
ProperPlateau   <=+>                                  y  n  y  n  n
Gorge           (>(>|=)∗)∗><((<|=)∗<)∗                y  n  y  n  n
Summit          (<(<|=)∗)∗<>((>|=)∗>)∗                y  n  y  n  n
Peak            <(<|=)∗(>|=)∗>                        y  n  y  n  n
Valley          >(>|=)∗(<|=)∗<                        y  n  y  n  n
DecSeq          >(>|=)∗>|>                            y  y  n  y  n
IncSeq          <(<|=)∗<|<                            y  y  n  y  n
SteadySeq       =+                                    y  y  n  y  n
StrictlyDecSeq  >+                                    y  y  n  y  n
StrictlyIncSeq  <+                                    y  y  n  y  n
Zigzag          (<>)+<(>|ε)|(><)+>(<|ε)    1    1     y  n  n  n  n

Table 1: Consider a sequence $x_1 x_2 \dots x_n$. (Features) features f with their values computed from an extended σ-pattern $x_i x_{i+1} \dots x_j$; (Aggregator) aggregator g = Sum, its value computed from a sequence of feature values $f_1, f_2, \dots, f_c$; (Patterns) patterns $\sigma = \langle \mathcal{L}_\sigma, b_\sigma, a_\sigma \rangle$ grouped by the properties they share, where columns r, n, o, e, s respectively indicate whether a pattern has a reverse in the catalogue [3], the no-inflexion, the one-inflexion, the exclude-out-in, or the single letter properties.

In the following $x_{i,j}$ denotes the integer subsequence $x_i x_{i+1} \dots x_j$ when i ≤ j and $x_i x_{i-1} \dots x_j$ otherwise. The term $f_\sigma(x_{i,j})$ denotes the sum of the values of the feature f from every extended σ-pattern in subsequence $x_{i,j}$, i.e. the contribution of the sliding window [i, j].

We introduce a limited number of pattern properties that will be used to parameterise our proofs: we will assume that some of these properties hold to prove that a given equation is valid for calculating the contribution of a sliding window.

Definition 1.
The mirror of a regular language $\mathcal{L}$ over Σ = {‘<’, ‘=’, ‘>’}, denoted by $\mathcal{L}^{mir}$, consists of the mirrors of all the words in $\mathcal{L}$, where the mirror of a word w, denoted by $w^{mir}$, has the reverse order of its letters and has all occurrences of the letter ‘<’ flipped into ‘>’ and vice versa.

Definition 2.
Two patterns $\sigma = \langle \mathcal{L}_\sigma, b_\sigma, a_\sigma \rangle$ and $\sigma^r = \langle \mathcal{L}_{\sigma^r}, b_{\sigma^r}, a_{\sigma^r} \rangle$ are the reverse of each other iff $w \in \mathcal{L}_\sigma \Leftrightarrow w^{mir} \in \mathcal{L}_{\sigma^r}$, $a_\sigma = b_{\sigma^r}$ and $b_\sigma = a_{\sigma^r}$. As shown by column r of the pattern part of Table 1, 19 out of the 22 patterns of the time-series catalogue [3] have a reverse pattern defined inside [3].

Example 2 (reverse).
On the one hand, the Plateau = ⟨‘<=∗>’, 1, 1⟩ pattern is the reverse of itself since (1) all letters except the first and last letters of a plateau correspond to the letter ‘=’, (2) the first letter ‘<’ is the mirror of the last letter ‘>’, and (3) $a_{Plateau} = b_{Plateau} = 1$. On the other hand, the Inflexion pattern, whose language is described by ‘<(<|=)∗>|>(>|=)∗<’, is not the reverse of itself: the mirror of the word ‘<<>’ ∈ $\mathcal{L}_{Inflexion}$, i.e. the word ‘<>>’, is not an inflexion since it ends with two occurrences of ‘>’ rather than one.
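The mirror operation of Definition 1 and the language part of Definition 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`mirror`, `words_up_to`, `mirror_closed`) are hypothetical, the test enumerates words up to a bounded length instead of comparing automata, and the conditions on $a_\sigma$ and $b_\sigma$ are not checked.

```python
import re
from itertools import product

# Mirror of a letter: '<' and '>' are swapped, '=' is unchanged.
FLIP = {"<": ">", ">": "<", "=": "="}


def mirror(word):
    """Reverse the word and swap '<' with '>' (Definition 1)."""
    return "".join(FLIP[c] for c in reversed(word))


def words_up_to(max_len):
    """All non-empty words over {'<', '=', '>'} of length <= max_len."""
    for n in range(1, max_len + 1):
        for letters in product("<=>", repeat=n):
            yield "".join(letters)


def mirror_closed(regex, max_len=6):
    """Bounded check (an assumption, not a proof): is the mirror of every
    word of the language with at most max_len letters in the language?"""
    pat = re.compile(regex)
    return all(
        pat.fullmatch(mirror(w)) is not None
        for w in words_up_to(max_len)
        if pat.fullmatch(w)
    )


PLATEAU = r"<=*>"
INFLEXION = r"<(<|=)*>|>(>|=)*<"

print(mirror("<<>"))             # <>>
print(mirror_closed(PLATEAU))    # True
print(mirror_closed(INFLEXION))  # False
```

The two calls reproduce Example 2: the language of Plateau is closed under mirror, while the word ‘<<>’ of Inflexion has a mirror outside the language.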
Definition 3.
A pattern σ has the convexity property if for any word $w = s_1 s_2 \dots s_{n-1}$ in $\mathcal{L}_\sigma$ and for any pair of factors $u = s_c s_{c+1} \dots s_d$ and $v = s_e s_{e+1} \dots s_f$ of w (with $c, d, e, f \in [1, n-1]$) such that both u and v are words in $\mathcal{L}_\sigma$, the word $s_{\min(c,e)} s_{\min(c,e)+1} \dots s_{\max(d,f)}$ is also in $\mathcal{L}_\sigma$.

Example 3 (convexity property). All patterns of the time-series catalogue [3] have the convexity property, but the pattern whose language is denoted by $\mathcal{L}_{<=>=|<=|=>}$ does not, since the word ‘<=>=’ in $\mathcal{L}_{<=>=|<=|=>}$ contains a factor ‘<=>’ that is not in $\mathcal{L}_{<=>=|<=|=>}$, for which both the prefix ‘<=’ and the suffix ‘=>’ belong to $\mathcal{L}_{<=>=|<=|=>}$.

Definition 4.
A pattern σ has the no-inflexion property if no word in its language $\mathcal{L}_\sigma$ simultaneously contains the letters ‘<’ and ‘>’. (Through an abuse of language and for reasons of brevity we say “pattern property of σ” rather than “property of the language $\mathcal{L}_\sigma$ of the pattern σ”.)

Definition 5.
A pattern σ has the one-inflexion property if any word in its language $\mathcal{L}_\sigma$ contains one occurrence of either ‘<=∗>’ or ‘>=∗<’, but not of both.

Definition 6.
A pattern σ has the single letter property if all words of $\mathcal{L}_\sigma$ have a length of one.

Definition 7.
A pattern σ has the exclude-out-in property if for any word $s_1 s_2 \dots s_{n-1}$ in $\mathcal{L}_\sigma$ and for any window [i, j] (with $1 \le i \le j \le n$ and $i > 1 \lor j$
Steady have the single letter property.
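As a rough illustration of Definitions 4 to 6, the properties can be tested on all words of a pattern language up to a bounded length. The sketch below is an assumption-laden approximation (helper names are hypothetical, and a bounded enumeration replaces an exact test on the whole regular language); the regular expression used for the one-inflexion test is the one appearing in Equation (10).

```python
import re
from itertools import product


def language(regex, max_len=6):
    """Words of the pattern language with at most max_len letters."""
    pat = re.compile(regex)
    for n in range(1, max_len + 1):
        for letters in product("<=>", repeat=n):
            word = "".join(letters)
            if pat.fullmatch(word):
                yield word


def no_inflexion(regex, max_len=6):
    # Definition 4: no word contains both '<' and '>'.
    return all(not ("<" in w and ">" in w) for w in language(regex, max_len))


# Words containing exactly one of the two inflexions '<=*>' and '>=*<'.
ONE_INFLEXION = re.compile(r"(<|=)*<=*>(>|=)*|(>|=)*>=*<(<|=)*")


def one_inflexion(regex, max_len=6):
    # Definition 5, tested word by word on the bounded language.
    return all(ONE_INFLEXION.fullmatch(w) for w in language(regex, max_len))


def single_letter(regex, max_len=6):
    # Definition 6: every word has length one.
    return all(len(w) == 1 for w in language(regex, max_len))


DECSEQ = r">(>|=)*>|>"
PEAK = r"<(<|=)*(>|=)*>"
STEADY = r"="
ZIGZAG = r"(<>)+<(>|)|(><)+>(<|)"  # '(>|)' encodes the optional letter

for name, rx in [("DecSeq", DECSEQ), ("Peak", PEAK),
                 ("Steady", STEADY), ("Zigzag", ZIGZAG)]:
    print(name, no_inflexion(rx), one_inflexion(rx), single_letter(rx))
```

On these four patterns the bounded tests agree with columns n, o and s of Table 1; a bounded enumeration can of course only refute a property, never prove it for the whole language.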
We introduce the sliding time-series constraint we consider.
Definition 8.
Given a feature f, a regular expression σ, an integer m > , two variables low and up, and a sequence of variables $X = x_1 x_2 \dots x_n$ with n ≥ m, the slide_sum_f_σ(m, low, up, X) constraint holds iff

$$low = \min_{i \in [1,\, n-m+1]} r_i, \qquad (1)$$
$$up = \max_{i \in [1,\, n-m+1]} r_i, \qquad (2)$$

with sum_f_σ($r_i$, $x_{i,i+m-1}$), where $r_i$ is called the contribution of the time-series constraint sum_f_σ in the window [i, i+m−1].

Cond. (1) (resp. (2)) of Def. 8 enforces low (resp. up) to be the minimum (resp. maximum) of the sum of the feature values of feature f wrt all maximal occurrences of σ in each subsequence of m consecutive variables of sequence X.

Example 5 (Continuation of Example 1).
Given the pattern IncSeq and the feature surf, slide_sum_surf_incseq(10, 7, 15, ⟨3, 1, 3, 3, 2, 1, 1, 2, 2, 2, 4, 4, 3, 1, 2, 2⟩) is satisfied because the sum of the surfaces of the increasing sequences in the different sliding windows of size 10 is between 7 and 15, as shown in Example 1.

4.1 Computing the Contribution in a Window

In this section, we consider the patterns σ and $\sigma^r$ which are the reverse of each other, a feature f, an integer sequence $x_1 x_2 \dots x_n$, and all windows $x_i x_{i+1} \dots x_{i+m-1}$ of size m (with $i \in [1, n-m+1]$). We investigate how to evaluate directly from an equation the sum of the feature values of feature f of all pattern occurrences located in a window [i, j = i+m−1], assuming all the elements of the right-hand side of the equation have been previously calculated in time proportional to n. As we have several features and several patterns, we use three equations, all derived from the same simple idea, for which we first present the intuition. Then we define sufficient properties of features and patterns that ensure the validity of each of the three equations (4), (5) and (6). At the end of this section, Table 4 provides an overview of the validity of each of the three equations according to properties of the patterns and features.

Intuition
Assume we want to deal with the following simplified problem: given an integer sequence $x_1 x_2 \dots x_n$, compute for all subsequences of m consecutive positions the sum $t_{i,j} = \sum_{k \in [i,j]} x_k$ (with j = i+m−1) of the corresponding elements in time O(n). This can be done by first computing the partial sums $\sum_{c \in [1,k]} x_c$ (with $k \in [1,n]$) and $\sum_{c \in [k,n]} x_c$ (with $k \in [1,n]$), and by using the identity

$$t_{i,j} = \sum_{k \in [1,j]} x_k + \sum_{k \in [i,n]} x_k - \sum_{k \in [1,n]} x_k. \qquad (3)$$

Equations (4), (5) and (6) present three alternative ways to compute $f_\sigma(x_{i,j})$ inspired by Equation (3).

$$f_\sigma(x_{i,j}) = f_\sigma(x_{1,j}) + f_{\sigma^r}(x_{n,i}) - f_\sigma(x_{1,n}) \qquad (4)$$
$$f_\sigma(x_{i,j}) = \max\bigl(0,\; f_\sigma(x_{1,j}) + f_{\sigma^r}(x_{n,i}) - f_\sigma(x_{1,n})\bigr) \qquad (5)$$
$$f_\sigma(x_{i,j}) = \begin{cases} 0 & \text{if there is no } \sigma\text{-pattern in } x_{i,j} \\ f_\sigma(x_{1,j}) + f_{\sigma^r}(x_{n,i}) - f_\sigma(x_{1,n}) & \text{otherwise} \end{cases} \qquad (6)$$

Depending on the properties of the pattern σ and of the feature f, we investigate the cases when Equations (4), (5) and (6) are valid.

Example 6.
Consider the DecSeq pattern of Table 1, the sequence w = 2 1 1 1 0, and the window size m = 2, i.e. the four sliding windows 2 1, 1 1, 1 1 and 1 0.
• Equation (4) provides the incorrect surf feature value for two of the four sliding windows, namely values 3, −1, −1 and 1 rather than the expected values 3, 0, 0 and 1. For the second window, this is because there is a non-empty gap between the leftmost and rightmost decreasing sequences in w. Equation (5) gives the correct value since it cancels out the contribution of the gap.
• While Equations (4) and (5) give the incorrect min feature value for two of the four sliding windows, namely values 1, 1, 1 and 0 rather than values 1, 0, 0 and 0, Equation (6) provides the correct values.

case   condition
(1)    $u < i$
(2)    $\ell > j$
(3)    $i \le \ell \le u \le j$
(4)    $\ell < i \le u \le j \;\land\; p(s_{i,u-1}) \in \mathcal{L}_\sigma$
(5)    $\ell < i \le u < j \;\land\; p(s_{i,u-1}) \notin \mathcal{L}_\sigma$
(6)    $i \le \ell \le j < u \;\land\; s(s_{\ell,j-1}) \in \mathcal{L}_\sigma$
(7)    $i < \ell \le j < u \;\land\; s(s_{\ell,j-1}) \notin \mathcal{L}_\sigma$
(8)    $(\ell \le i \le j \le u) \;\land\; (\ell \ne i \lor j \ne u)$

Table 2: Positioning an occurrence of a pattern, spanning positions ℓ to u, wrt a window [i, j]; within cases (4) and (6) the words $p(s_{i,u-1})$ and $s(s_{\ell,j-1})$ are non-empty.

Case Analysis
Consider a sequence $x_1 x_2 \dots x_n$, a window [i, j], and a maximal occurrence of pattern o whose signature is $s_\ell s_{\ell+1} \dots s_{u-1}$ (with $1 \le \ell \le u \le n$). Table 2 provides eight cases summarising all the possible positionings of $x_{\ell,u}$ wrt [i, j], where $p(s_{i,u-1})$ (resp. $s(s_{\ell,j-1})$) denotes the longest prefix $s_i s_{i+1} \dots s_{\alpha-1}$ of $s_i s_{i+1} \dots s_{u-1}$ (resp. the longest suffix $s_\beta s_{\beta+1} \dots s_{j-1}$ of $s_\ell s_{\ell+1} \dots s_{j-1}$) in $\mathcal{L}_\sigma$ if such a word exists, the empty word otherwise. For cases (1–7) of Table 2, columns $f_\sigma(x_{1,j})$ (resp. $f_{\sigma^r}(x_{n,i})$) and $f_\sigma(x_{1,n})$ of Table 3 provide the feature value of the σ-pattern occurrence o (resp. $\sigma^r$-pattern occurrence $o^r$) wrt $x_1 x_2 \dots x_j$ (resp. $x_n x_{n-1} \dots x_i$) and $x_1 x_2 \dots x_n$; the last three columns give the contribution of o in the right-hand side of Equations (4), (5), (6). These contributions agree with the positioning of $x_{\ell,u}$ wrt [i, j], except for the three grey cells, which only work for non-negative feature values. Case (8) of Table 2 corresponds to a maximal occurrence of pattern o whose signature starts before i and ends after j. To study Case (8), the next section classifies a pattern wrt a window.

case  $f_\sigma(x_{1,j})$        $f_{\sigma^r}(x_{n,i})$        $f_\sigma(x_{1,n})$        Eq. (4)                        Eq. (5)                                  Eq. (6)
(1)   $f_\sigma(x_{\ell,u})$     $0$                            $f_\sigma(x_{\ell,u})$     $0$                            $0$                                      $0$
(2)   $0$                        $f_{\sigma^r}(x_{u,\ell})$     $f_\sigma(x_{\ell,u})$     $0$                            $0$                                      $0$
(3)   $f_\sigma(x_{\ell,u})$     $f_{\sigma^r}(x_{u,\ell})$     $f_\sigma(x_{\ell,u})$     $f_{\sigma^r}(x_{u,\ell})$     $\max(0, f_{\sigma^r}(x_{u,\ell}))$      $f_{\sigma^r}(x_{u,\ell})$
(4)   $f_\sigma(x_{\ell,u})$     $f_{\sigma^r}(x_{\alpha,i})$   $f_\sigma(x_{\ell,u})$     $f_{\sigma^r}(x_{\alpha,i})$   $\max(0, f_{\sigma^r}(x_{\alpha,i}))$    $f_{\sigma^r}(x_{\alpha,i})$
(5)   $f_\sigma(x_{\ell,u})$     $0$                            $f_\sigma(x_{\ell,u})$     $0$                            $0$                                      $0$
(6)   $f_\sigma(x_{\beta,j})$    $f_{\sigma^r}(x_{u,\ell})$     $f_\sigma(x_{\ell,u})$     $f_\sigma(x_{\beta,j})$        $\max(0, f_\sigma(x_{\beta,j}))$         $f_\sigma(x_{\beta,j})$
(7)   $0$                        $f_{\sigma^r}(x_{u,\ell})$     $f_\sigma(x_{\ell,u})$     $0$                            $0$                                      $0$

Table 3: [columns 2 to 4] values of $f_\sigma(x_{1,j})$, $f_{\sigma^r}(x_{n,i})$ and $f_\sigma(x_{1,n})$ wrt cases (1–7) of Table 2; [columns 5 to 7] contribution of an occurrence of σ in a window wrt the right-hand side of Equations (4), (5) and (6).

Systematic Classification of Patterns wrt Windows

Definition 9. [type of a word wrt a pattern]
Given a pattern σ, the type of a proper factor $w = w_1 w_2 \dots w_k$ of a word in $\mathcal{L}_\sigma$ wrt σ is defined by five mutually incompatible conditions:
• out if $\nexists c, d : 1 \le c \le d \le k \land w_c w_{c+1} \dots w_d \in \mathcal{L}_\sigma$
• fac if $\exists c, d : 1 \le c \le d \le k \land w_c w_{c+1} \dots w_d \in \mathcal{L}_\sigma$, and $\nexists d : 1 \le d \le k \land w_1 w_2 \dots w_d \in \mathcal{L}_\sigma$, and $\nexists c : 1 \le c \le k \land w_c w_{c+1} \dots w_k \in \mathcal{L}_\sigma$
• pre if $\exists d : 1 \le d \le k \land w_1 w_2 \dots w_d \in \mathcal{L}_\sigma$, and $\nexists c : 1 \le c \le k \land w_c w_{c+1} \dots w_k \in \mathcal{L}_\sigma$
• suf if $\exists c : 1 \le c \le k \land w_c w_{c+1} \dots w_k \in \mathcal{L}_\sigma$, and $\nexists d : 1 \le d \le k \land w_1 w_2 \dots w_d \in \mathcal{L}_\sigma$
• in if $\exists d : 1 \le d \le k \land w_1 w_2 \dots w_d \in \mathcal{L}_\sigma$, and $\exists c : 1 \le c \le k \land w_c w_{c+1} \dots w_k \in \mathcal{L}_\sigma$

In Definition 9, “fac”, “pre” and “suf” convey the idea of “factor”, “prefix” and “suffix”. Note that a word with the “in” type wrt a convex pattern σ is in $\mathcal{L}_\sigma$. The languages associated with the five mutually incompatible conditions of Definition 9 are defined as
$\mathcal{L}_{out} = \Sigma^+ \setminus (\Sigma^* \mathcal{L}_\sigma \Sigma^*)$,
$\mathcal{L}_{fac} = \Sigma^+ \mathcal{L}_\sigma \Sigma^+ \cap \Sigma^* \setminus (\mathcal{L}_\sigma \Sigma^+) \cap \Sigma^* \setminus (\Sigma^+ \mathcal{L}_\sigma) \cap \Sigma^* \setminus \mathcal{L}_\sigma$,
$\mathcal{L}_{pre} = \mathcal{L}_\sigma \Sigma^+ \cap \Sigma^* \setminus (\Sigma^+ \mathcal{L}_\sigma) \cap \Sigma^* \setminus \mathcal{L}_\sigma$,
$\mathcal{L}_{suf} = \Sigma^+ \mathcal{L}_\sigma \cap \Sigma^* \setminus (\mathcal{L}_\sigma \Sigma^+) \cap \Sigma^* \setminus \mathcal{L}_\sigma$, and
$\mathcal{L}_{in} = \mathcal{L}_\sigma \Sigma^* \cap \Sigma^* \mathcal{L}_\sigma$.
Note that because of our hypothesis that $\mathcal{L}_\sigma$ does not contain the empty word, the languages $\mathcal{L}_{out}$, $\mathcal{L}_{fac}$, $\mathcal{L}_{pre}$, $\mathcal{L}_{suf}$ and $\mathcal{L}_{in}$ do not contain the empty word.

Definition 10. [type and signature of a word wrt one of its proper factors and wrt a pattern]
Given a pattern σ, consider a word $w = w_1 w_2 \dots w_k$ of $\mathcal{L}_\sigma$, and one of its proper factors $v = w_i w_{i+1} \dots w_j$. The type of w wrt v and σ is defined by $\langle t_1, t_2, t_3 \rangle$ where $t_1$, $t_2$ and $t_3$ are respectively the type of the words $w_1 w_2 \dots w_j$, $w_i w_{i+1} \dots w_j$ and $w_i w_{i+1} \dots w_k$ wrt pattern σ as defined by Definition 9. The signature of w wrt v and σ is defined by $\langle sig_1, sig_2, sig_3 \rangle$, where $sig_c$ indicates whether $t_c$ = out or not (with $c \in [1, 3]$).

Theorem 1. [map of feasible types wrt any pattern]
Of the 125 possible types of Definition 10, only the triples shown in Figure 1 are feasible.

Proof. For each triple of Figure 1, Appendix A provides a witness pattern which generates such a triple. We now prove that the missing triples cannot be obtained from any pattern.
① ⟨out, ≠out, -⟩ (resp. ⟨-, ≠out, out⟩) is not feasible since an “out” in $v = w_1 w_2 \dots w_j$ (resp. $w_i w_{i+1} \dots w_k$) would imply an “out” in any subsequence of v, namely in $w_i w_{i+1} \dots w_j$, a contradiction.
② ⟨fac, suf, -⟩ (resp. ⟨-, pre, fac⟩) is not feasible since a “suf” (resp. “pre”) in $w_i w_{i+1} \dots w_j$ would imply a “suf” (resp. “pre”) or an “in” in $w_1 w_2 \dots w_j$ (resp. $w_i w_{i+1} \dots w_k$), a contradiction.
③ ⟨fac, in, -⟩ (resp. ⟨-, in, fac⟩) is not feasible since an “in” in $w_i w_{i+1} \dots w_j$ would imply a “suf” (resp. “pre”) or an “in” in $w_1 w_2 \dots w_j$ (resp. $w_i w_{i+1} \dots w_k$), a contradiction.
④ ⟨pre, suf, -⟩ (resp. ⟨-, pre, suf⟩) is not feasible since a “suf” (resp. “pre”) in $w_i w_{i+1} \dots w_j$ and a “pre” (resp. “suf”) in $v = w_1 w_2 \dots w_j$ (resp. $v = w_i w_{i+1} \dots w_k$) would imply an “in” in v, a contradiction.
⑤ ⟨pre, in, -⟩ (resp. ⟨-, in, suf⟩) is not feasible since an “in” in $w_i w_{i+1} \dots w_j$ and a “pre” (resp. “suf”) in $v = w_1 w_2 \dots w_j$ (resp. $v = w_i w_{i+1} \dots w_k$) would imply an “in” in v, a contradiction. ⊓⊔

[Figure 1: graph of the feasible triples, Parts (A) to (D)]

Fig. 1: Map of the feasible types where Parts (A), (B), (C) and (D) resp. correspond to triples with three “out”, two “out”, one “out” and no “out”, where o, i, p, s, f resp. are abbreviations for “out”, “in”, “pre”, “suf”, “fac”; there is an arc from a triple $t$ to a triple $t'$ iff (1) $t$ and $t'$ have the same signature, (2) $t$ and $t'$ differ in exactly one position, (3) $t$ is lexicographically less than $t'$ assuming i ≺ p, i ≺ s, p ≺ f and s ≺ f; ellipses denote the types of the DecSeq pattern as described in Example 7.
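Definitions 9 and 10 translate fairly directly into a brute-force procedure. The following sketch is illustrative only: the function names are hypothetical, Python's `re.fullmatch` stands in for membership in $\mathcal{L}_\sigma$, and the enumeration is limited to words of at most `max_len` letters, whereas the paper works with exact emptiness tests on regular languages.

```python
import re
from itertools import product


def word_type(u, pat):
    """Type of a non-empty word u wrt a pattern (Definition 9)."""
    k = len(u)
    has_prefix = any(pat.fullmatch(u[:d]) for d in range(1, k + 1))
    has_suffix = any(pat.fullmatch(u[c:]) for c in range(k))
    if has_prefix and has_suffix:
        return "in"
    if has_prefix:
        return "pre"
    if has_suffix:
        return "suf"
    has_factor = any(
        pat.fullmatch(u[c:d]) for c in range(k) for d in range(c + 1, k + 1)
    )
    return "fac" if has_factor else "out"


def triples(regex, max_len=6):
    """Types <t1, t2, t3> (Definition 10) of all words of the language with
    at most max_len letters, wrt each of their proper factors."""
    pat = re.compile(regex)
    found = set()
    for n in range(2, max_len + 1):
        for letters in product("<=>", repeat=n):
            w = "".join(letters)
            if not pat.fullmatch(w):
                continue
            for i in range(n):
                for j in range(i + 1, n + 1):
                    if j - i == n:
                        continue  # a proper factor differs from w
                    found.add((word_type(w[:j], pat),   # w_1 ... w_j
                               word_type(w[i:j], pat),  # w_i ... w_j
                               word_type(w[i:], pat)))  # w_i ... w_k
    return found


DECSEQ = r">(>|=)*>|>"
for t in sorted(triples(DECSEQ)):
    print(t)
```

For DecSeq this enumeration yields exactly the five triples ⟨pre, fac, suf⟩, ⟨pre, pre, in⟩, ⟨in, suf, suf⟩, ⟨in, in, in⟩ and ⟨pre, out, suf⟩ mentioned in the caption of Figure 1 as the types of the DecSeq pattern.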
Notation
In the context of Definition 10, when $w_1 w_2 \dots w_j$ is of type pre or in, i.e. it contains a maximal occurrence of a word x in $\mathcal{L}_\sigma$ starting at index 1, ψ denotes the index of the last letter of x. Similarly, when $w_i w_{i+1} \dots w_k$ is of type suf or in, i.e. it contains a maximal occurrence of a word y in $\mathcal{L}_\sigma$ ending at index k, λ denotes the index of the first letter of y.

Since the number of triples is large, we reduce the number of cases to be considered in our proofs by introducing Definition 11, which groups a number of triples in the same class representing the weakest hypothesis associated with the different triples of this class. Consider the (finite) set S of triples associated with the words of $\mathcal{L}_\sigma$ wrt their proper factors. We partition the set S into subsets where all triples of the same subset have the same signature. Then, we generalise all the triples that belong to the same subset to a unique representative using the following definition.

Definition 11. [generalising a set of triples]
Given a pattern σ, consider the set of triples S consisting of all types of the words of $\mathcal{L}_\sigma$ wrt their proper factors and wrt σ that have the same signature. Let $S_c$ (with $c \in [1, 3]$) denote the set of all the c-th components of the triples of S. The set S is represented by a single representative triple $R_S = \langle r_1, r_2, r_3 \rangle$ where $r_c$, with $c \in [1, 3]$, is defined by
• $S_c = \{\text{out}\} \Rightarrow r_c = \text{out}$
• $\text{fac} \in S_c \Rightarrow r_c = \text{fac}$
• $\text{pre} \in S_c \land \text{suf} \notin S_c \land \text{fac} \notin S_c \Rightarrow r_c = \text{pre}$
• $\text{suf} \in S_c \land \text{pre} \notin S_c \land \text{fac} \notin S_c \Rightarrow r_c = \text{suf}$
• $\text{pre} \in S_c \land \text{suf} \in S_c \land \text{fac} \notin S_c \Rightarrow r_c = \text{ps}$
• $\text{in} \in S_c \land \text{pre} \notin S_c \land \text{suf} \notin S_c \land \text{fac} \notin S_c \Rightarrow r_c = \text{in}$

Definition 12. [pattern class]
Given a pattern σ, the set of representative triples of σ is called the class of σ.

Example 7. The set S of possible types associated with the DecSeq pattern, whose language is described by ‘>(>|=)∗>|>’, is equal to the union of two subsets $S_1$ = {⟨pre, fac, suf⟩, ⟨pre, pre, in⟩, ⟨in, suf, suf⟩, ⟨in, in, in⟩} and $S_2$ = {⟨pre, out, suf⟩}, where each subset corresponds to triples for which all “out” are located in the same positions, see the five ellipses in Parts (D) and (C) of Figure 1. Part (A) of Figure 2 gives for each element of $S_1$ and $S_2$ a corresponding example of a word and a proper factor. The sets $S_1$ and $S_2$ are respectively represented by the triples ⟨pre, fac, suf⟩ and ⟨pre, out, suf⟩ as shown in Part (B) of Figure 2. Finally, Figure 3 provides the representative triples for all reversible and convex patterns of Table 1 which do not have the single letter property.

Finding all the representatives of a pattern
To generate all the representatives of a pattern σ, (i) we first generate all potential word types wrt σ, and (ii) we then use Definition 11. For each potential word type $\langle t_1, t_2, t_3 \rangle$ with $t_i$ ∈ {out, fac, pre, suf, in} depicted in Figure 1, we describe a systematic method to check whether or not there exists a word $w = w_1 w_2 \dots w_k$ of $\mathcal{L}_\sigma$ whose type is $\langle t_1, t_2, t_3 \rangle$. For this purpose we define the language of $\langle t_1, t_2, t_3 \rangle$ wrt σ and check whether it is empty or not. Since we need the prefix of w associated with $t_1$ to overlap the suffix of w associated with $t_3$, we first introduce the notion of shuffle language.

[Figure 2]
(A) ① ⟨pre, fac, suf⟩   ② ⟨pre, pre, in⟩   ③ ⟨in, suf, suf⟩   ④ ⟨in, in, in⟩   ⑤ ⟨pre, out, suf⟩
(B) ⟨pre, fac, suf⟩ represents ① to ④; ⟨pre, out, suf⟩ represents ⑤
(C) ① >(=|>)∗ s (=+>+)+=+ s =∗>+(=+>+)∗
    ② (>(=|>)∗|ε) s (>+=+)+ s =∗>+(=+>+)∗
    ③ >(=|>)∗ s (=+>+)+ s >∗(=+>+)∗
    ④ s >+(=+>+)∗ s =∗>+(=+>+)∗ | >(=|>)∗ s >+(=+>+)∗ s >∗(=+>+)∗
    ⑤ >(=|>)∗ s =+ s =∗>+(=+>+)∗

Fig. 2: (A) Set of possible types of words wrt their proper factors of the DecSeq pattern with the corresponding examples of word w and proper factor (in grey), where ψ (resp. λ) denotes the end (resp. start) of a maximal word in $\mathcal{L}_{DecSeq}$ starting at the first position (resp. ending at the last position) of w, (B) corresponding set of representative triples, and (C) languages of the types of words ①, ②, ③, ④, ⑤ as computed from Equation (7) of Theorem 2.

Definition 13. [shuffle language]
Given a regular language $\mathcal{L}$ over an input alphabet Σ, and a possibly new input letter s, i.e. a letter that does not necessarily belong to Σ, the shuffle language of $\mathcal{L}$ wrt s, denoted shuffle($\mathcal{L}$, s), is defined by all words w over the alphabet Σ ∪ {s} such that
i) w contains at least one occurrence of the letter s,
ii) if we remove one single occurrence of the letter s from w then the resulting word belongs to $\mathcal{L}$.

Theorem 2. [language of a word type]
Given a pattern σ and one of its potential word types $\langle t_1, t_2, t_3 \rangle$, the language associated with $\langle t_1, t_2, t_3 \rangle$ is defined by

$$\bigcap \begin{cases} \mathit{shuffle}(\mathit{shuffle}(\mathcal{L}_\sigma, s), s) \\ \mathit{shuffle}(\mathcal{L}_{t_1}, s)\; s\, \Sigma^* \\ \Sigma^* s\, \mathcal{L}_{t_2}\, s\, \Sigma^+ \;\mid\; \Sigma^+ s\, \mathcal{L}_{t_2}\, s\, \Sigma^* \\ \Sigma^* s\; \mathit{shuffle}(\mathcal{L}_{t_3}, s) \end{cases} \qquad (7)$$

Proof.
The four sub-expressions on the right-hand side of (7) respectively correspond to a word of $\mathcal{L}_\sigma$ into which two occurrences of s are inserted, and to three ways of decomposing it wrt its prefix, its window, and its suffix. The letter s is used to “synchronise” these decompositions, i.e. to enforce a non-empty intersection between the prefix and the suffix. Since $\mathcal{L}_{t_2}$ does not contain the empty word, the two occurrences of s delimit a non-empty window. ⊓⊔

Example 8 (Continuation of Example 7).
Part (C) of Figure 2 gives the languages of the types of words ⟨pre, fac, suf⟩, ⟨pre, pre, in⟩, ⟨in, suf, suf⟩, ⟨in, in, in⟩, and ⟨pre, out, suf⟩ for the DecSeq pattern, as defined by Theorem 2. Note that all other triples lead to the empty language.

Evaluating whether the language associated with a regular expression is empty or not (e.g. Expression (7)) is done by (i) converting all its operator instances (e.g. union, intersection, concatenation, Kleene star, shuffle, ...) to deterministic finite automata, by (ii) evaluating the corresponding sequence of operations on finite automata, and by (iii) checking whether the resulting minimised automaton has at least one accepting state or not. Following this methodology, Appendix C gives the corresponding programs which compute the representatives and the properties of a pattern. We now show how to generate a finite automaton for the shuffle operator that we previously introduced.
– [shuffle] From the deterministic and minimised automaton $A_{\mathcal{L}}$ associated with $\mathcal{L}$, one can build the automaton $A_{\mathit{shuffle}(\mathcal{L},s)}$ associated with the language shuffle($\mathcal{L}$, s) by (i) duplicating all states of $A_{\mathcal{L}}$ and making the duplicates non-initial, (ii) making all original states of $A_{\mathcal{L}}$ non-accepting, and (iii) adding a transition labelled by s from each state to its duplicated state.

Establishing the properties of a pattern
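We now describe how to systematically find the properties of a pattern σ. We use $\mathcal{L}^{ssss}_\sigma$ (resp. $\mathcal{L}^{ss}_\sigma$) as a shortcut for shuffle(shuffle(shuffle(shuffle($\mathcal{L}_\sigma$, s), s), s), s) (resp. shuffle(shuffle($\mathcal{L}_\sigma$, s), s)).

• A pattern σ has the convexity property iff

$$\bigcap \begin{cases} \mathcal{L}^{ss}_\sigma \\ \Sigma^* s\, \mathcal{L}_\sigma \Sigma^* \mathcal{L}_\sigma\, s\, \Sigma^* \\ \Sigma^* s\, (\Sigma^+ \setminus \mathcal{L}_\sigma)\, s\, \Sigma^* \end{cases} \;\cup\; \bigcap \begin{cases} \mathcal{L}^{ssss}_\sigma \\ \Sigma^* s\; \mathit{shuffle}(\mathcal{L}_\sigma, s)\; s\, \Sigma^+ s\, \Sigma^* \\ \Sigma^* s\, \Sigma^+ s\; \mathit{shuffle}(\mathcal{L}_\sigma, s)\; s\, \Sigma^* \\ \Sigma^* s\; \mathit{shuffle}(\mathit{shuffle}(\Sigma^+ \setminus \mathcal{L}_\sigma, s), s)\; s\, \Sigma^* \end{cases} = \emptyset \qquad (8)$$

• A pattern σ has the no-inflexion property iff

$$(\mathcal{L}_\sigma \cap \Sigma^* {<} \Sigma^* {>} \Sigma^*) \cup (\mathcal{L}_\sigma \cap \Sigma^* {>} \Sigma^* {<} \Sigma^*) = \emptyset \qquad (9)$$

• A pattern σ has the one-inflexion property iff

$$\mathcal{L}_\sigma \setminus \mathcal{L}_{(<|=)^* <=^* > (>|=)^* \,\mid\, (>|=)^* >=^* < (<|=)^*} = \emptyset \qquad (10)$$

• A pattern σ has the exclude-out-in property iff

$$\bigcap \begin{cases} \mathcal{L}^{ssss}_\sigma \\ \Sigma^* s\, \Sigma^+ s\; \mathit{shuffle}(\Sigma^+ \setminus (\Sigma^* \mathcal{L}_\sigma \Sigma^*), s)\; s\, \Sigma^* \\ \Sigma^* s\; \mathit{shuffle}(\mathcal{L}_\sigma, s)\; s\, \Sigma^* s\, \Sigma^* \\ \Sigma^* s\, \Sigma^+ s\, \Sigma^+ s\, \Sigma^* s\, \Sigma^* \end{cases} \;\cup\; \bigcap \begin{cases} \mathcal{L}^{ssss}_\sigma \\ \Sigma^* s\; \mathit{shuffle}(\Sigma^+ \setminus (\Sigma^* \mathcal{L}_\sigma \Sigma^*), s)\; s\, \Sigma^+ s\, \Sigma^* \\ \Sigma^* s\, \Sigma^* s\; \mathit{shuffle}(\mathcal{L}_\sigma, s)\; s\, \Sigma^* \\ \Sigma^* s\, \Sigma^* s\, \Sigma^+ s\, \Sigma^+ s\, \Sigma^* \end{cases} = \emptyset \qquad (11)$$

• A pattern σ has the single letter property iff

$$\mathcal{L}_\sigma \setminus \mathcal{L}_{<|=|>} = \emptyset \qquad (12)$$

As the constructions used in (8), (9), (10), (11) and (12) are similar to the one used in Theorem 2, they are not detailed.

[Figure 3: lattice of the representative triples ⟨pre, out, suf⟩, ⟨pre, fac, suf⟩, ⟨pre, out, out⟩, ⟨out, out, suf⟩, ⟨out, out, out⟩, ⟨in, out, in⟩, ⟨in, in, in⟩, ⟨in, out, out⟩, ⟨out, out, in⟩, with the classes of the patterns IncSeq, DecSeq, Gorge, Summit, Peak, Valley, SteadySeq, StrictlyIncSeq, StrictlyDecSeq, DecTerrace, IncTerrace, Plain, Plateau, ProperPlain, ProperPlateau and Zigzag]

Fig. 3: Pattern classes, where each class corresponds to a set of representative triples (an arrow from a triple ① to a triple ② means that ① generalises ②).

Proof of Equations Based on Pattern and Feature Properties.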
In this section we study the properties of patterns and features that ensure the validity of Equations (4), (5) and (6). While pattern properties were already introduced in Sections 3 and 4.1, we first present some feature properties. Second, we focus on the validity domain of Equation (4), and finally, based on these results, we derive the properties of the patterns and features for Equations (5) and (6). From now on we focus on commutative features, as well as reversible and convex patterns, which do not have the single letter property.
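As a concrete reference point for Equations (4), (5) and (6), the sketch below recomputes their right-hand sides by brute force on the data of Example 6 (DecSeq, w = 2 1 1 1 0, m = 2) for the surf and min features. It is a validation aid under stated assumptions, not the Θ(n) checker: every term is recomputed from scratch, "maximal occurrence" is read as a greedy leftmost-longest match of the signature, $b_\sigma = a_\sigma = 0$ is assumed for DecSeq and IncSeq, and all function names are hypothetical.

```python
import re


def signature(xs):
    """Signature of an integer sequence: one letter per consecutive pair."""
    return "".join("<" if a < b else "=" if a == b else ">"
                   for a, b in zip(xs, xs[1:]))


def occurrences(xs, regex):
    """Extended pattern occurrences found by a greedy leftmost-longest scan
    of the signature (an assumed reading of 'maximal occurrence')."""
    sig, pat, found, i = signature(xs), re.compile(regex), [], 0
    while i < len(sig):
        for j in range(len(sig), i, -1):
            if pat.fullmatch(sig[i:j]):
                found.append(xs[i:j + 1])  # letters i..j-1 cover x_i..x_j
                i = j
                break
        else:
            i += 1
    return found


def f_sum(xs, regex, feature):
    """Sum of the feature over all occurrences (b = a = 0 assumed)."""
    return sum(feature(o) for o in occurrences(xs, regex))


DECSEQ, INCSEQ = r">(>|=)*>|>", r"<(<|=)*<|<"  # reverse of each other


def window_values(xs, m, feature):
    """Per window: direct value and right-hand sides of Eq. (4), (5), (6)."""
    n, full, rows = len(xs), f_sum(xs, DECSEQ, feature), []
    for i in range(n - m + 1):          # 0-based window xs[i : i + m]
        j = i + m
        direct_occ = occurrences(xs[i:j], DECSEQ)
        direct = sum(feature(o) for o in direct_occ)
        prefix = f_sum(xs[:j], DECSEQ, feature)        # f_sigma(x_{1,j})
        suffix = f_sum(xs[i:][::-1], INCSEQ, feature)  # f_sigma^r(x_{n,i})
        eq4 = prefix + suffix - full
        eq5 = max(0, eq4)
        eq6 = eq4 if direct_occ else 0
        rows.append((direct, eq4, eq5, eq6))
    return rows


for name, feat in (("surf", sum), ("min", min)):
    print(name, window_values([2, 1, 1, 1, 0], 2, feat))
```

Each tuple lists the direct value followed by the values of Equations (4), (5) and (6). For surf, Equation (4) gives 3, −1, −1, 1 against direct values 3, 0, 0, 1, and Equation (5) repairs the two negative entries; for min, Equations (4) and (5) give 1, 1, 1, 0 against 1, 0, 0, 0, and only Equation (6) agrees with the direct values.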
Feature Properties
All definitions of this section, i.e. Definitions 14 to 19, as well as all theorems of this section, i.e. Theorems 3 to 9, consider (i) a reversible and convex pattern $\sigma = \langle L_\sigma, b_\sigma, a_\sigma \rangle$, (ii) a sequence of variables $X = x_1 x_2 \dots x_n$, (iii) an extended σ-pattern occurrence in $[1, n]$ given by $o = \langle x_\ell x_{\ell+1} \dots x_u \rangle$ with $1 \le \ell \le u \le n$, and (iv) a commutative feature f applied to o.

We first present four feature properties that only depend on the feature f. We then introduce two additional feature properties that depend on both the feature f and the pattern σ. Finally, Part (A) of Figure 4 summarises the feature properties of each of the features defined in the time-series catalogue [3].

Definition 14. A feature f has the sum decomposition property if $f_\sigma(x_{\ell,u})$ can be expressed as $\sum_{t=\ell+b_\sigma}^{u-a_\sigma} h(x_t)$, where $h(x_t)$ is a function. E.g., when f = width, $h(x_t) = 1$ and the value returned by the application of f to the extended σ-pattern occurrence o is $u - \ell - b_\sigma - a_\sigma + 1$.

Definition 15. A feature f has the same value property if $f_\sigma(x_{\ell,u}) = f_\sigma(x_{i,j})$ for all i, j ($\ell \le i \le j \le u$) such that the sequence $x_{i,j}$ alone is an extended σ-pattern occurrence. E.g., when f = one and $x_{i,j}$ is an extended σ-pattern, $f_\sigma(x_{\ell,u}) = f_\sigma(x_{i,j}) = 1$.

Definition 16. A feature f has the single position property if $f_\sigma(x_{\ell,u})$ can be expressed as $h(x_t)$ with $x_t \in \{x_{\ell+b_\sigma}, x_{\ell+b_\sigma+1}, \dots, x_{u-a_\sigma}\}$. E.g., when f = max, $h(x_t) = x_t$ and $f_\sigma(x_{\ell,u})$ is the maximum of the variables in $x_{\ell+b_\sigma,\,u-a_\sigma}$.

Definition 17. A feature f has the positive property if $f_\sigma(x_{i,j}) \ge 0$ for all i, j such that $1 \le i \le j \le n$.

Definition 18. A feature f and a pattern σ have the single position no-inflexion property if (i) f has the single position property, (ii) σ has the no-inflexion property and either (iii.a) for all extended σ-pattern occurrences $x_{p,q}$ wrt $x_{p,q}$ (with $\ell \le p \le q \le u$) $f_\sigma(x_{p,q}) = h(x_{p+b_\sigma})$, or (iii.b) for all extended σ-pattern occurrences $x_{p,q}$ wrt $x_{p,q}$ (with $\ell \le p \le q \le u$) $f_\sigma(x_{p,q}) = h(x_{q-a_\sigma})$. E.g., the pair σ = DecSeq, f = min has the single position no-inflexion property since $f_\sigma(x_{\ell,u}) = x_{q-a_\sigma}$, where q is the end of the extended σ-pattern occurrence in $x_{\ell,u}$.

Definition 19. A feature f and a pattern σ have the single position inflexion property if (i) f has the single position property, (ii) σ has the one-inflexion property and (iii) $f_\sigma(x_{\ell,u})$ is computed from the position of the only inflexion of σ. E.g., the pair σ = Gorge, f = min has the single position inflexion property since the value of $f_\sigma(x_{\ell,u})$ corresponds to the only inflexion of the extended σ-pattern occurrence. But the pair σ = Gorge, f = max does not have the single position inflexion property since $f_\sigma(x_{\ell,u})$ corresponds to one of the two extremities of the gorge.

The next two sections define sufficient conditions where (i) Equation (4) and (ii)
Equations (5) and (6) can be used to compute the value of $f_\sigma(x_{i,j})$ wrt a pattern σ, depending on the representatives of a pattern. Part (B) of Figure 4 summarises all the theorems introduced in these two sections wrt the representatives of Figure 3.

Remark 1. Wlog, while doing the proof of such conditions we proceed as follows:
– When the representatives ⟨pre, fac, suf⟩ and ⟨in, in, in⟩ are both present, only ⟨pre, fac, suf⟩ is considered, since for ⟨in, in, in⟩ λ = i and ψ = j is a special case of ⟨pre, fac, suf⟩.
– Similarly, when the representatives ⟨pre, out, out⟩ and ⟨in, out, out⟩ (resp. ⟨out, out, suf⟩ and ⟨out, out, in⟩) both intervene in a proof, only ⟨pre, out, out⟩ (resp. ⟨out, out, suf⟩) is considered, as ψ = j (resp. λ = i).
– When ⟨in, out, out⟩ and ⟨out, out, in⟩ (resp. ⟨pre, out, out⟩ and ⟨out, out, suf⟩) both intervene in a proof, only ⟨in, out, out⟩ (resp. ⟨pre, out, out⟩) is considered, as the representative ⟨out, out, in⟩ (resp. ⟨out, out, suf⟩) is symmetric.

[Figure 4. Part (A), feature properties: one: same value, positive; width: sum decomposition, positive; surf: sum decomposition; max: single position; min: single position. Part (B): coverage of the representatives ⟨pre, fac, suf⟩, ⟨in, in, in⟩, ⟨pre, out, out⟩, ⟨out, out, suf⟩, ⟨in, out, out⟩, ⟨out, out, in⟩, ⟨out, out, out⟩, ⟨pre, out, suf⟩ and ⟨in, out, in⟩ by Theo. 3, 4 (Eq. 4), Theo. 5, 6 (Eq. 4), Theo. 7, 8 (Eq. 5) and Theo. 9 (Eq. 6).]
Fig. 4: (A) Properties of the features defined in Table 1 and used in [3], (B) theorems coverage for the different representatives of Figure 3.

Sufficient Conditions for the Validity of Equation (4)
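The proof of Theorem 3 below rests on a telescoping identity over partial sums of h. The following Python sketch checks that identity numerically; the names `h`, `ell`, `lam`, `psi`, `u`, `a` and `b` are illustrative stand-ins for $h(x_t)$, ℓ, λ, ψ, u, $a_\sigma$ and $b_\sigma$, and the sample values are arbitrary (they are not taken from the paper).

```python
def partial_sum(h, lo, hi):
    """Sum of h[t] for t in lo..hi (1-based, inclusive); empty if lo > hi."""
    return sum(h[t - 1] for t in range(lo, hi + 1))


def eq4_sum_decomposition(h, ell, lam, psi, u, a, b):
    """Right-hand side of Equation (4) under the sum decomposition property.

    Assumption: the pattern is reversible, so the reverse pattern trims
    b positions on the left and a positions on the right as well.
    """
    prefix = partial_sum(h, ell + b, psi - a)  # stands for f_sigma(x_{1,j})
    suffix = partial_sum(h, lam + b, u - a)    # stands for f_sigma^r(x_{n,i})
    full = partial_sum(h, ell + b, u - a)      # stands for f_sigma(x_{1,n})
    return prefix + suffix - full


# Arbitrary illustrative data with ell <= lam <= psi <= u.
h = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
ell, lam, psi, u, a, b = 1, 3, 7, 10, 1, 1

# Value expected for the window: the sum of h from lam + b to psi - a.
window_value = partial_sum(h, lam + b, psi - a)
```

On this data both `eq4_sum_decomposition(h, ell, lam, psi, u, a, b)` and `window_value` evaluate to 15, since the parts of the prefix and suffix sums lying outside the range from λ + b to ψ − a are exactly those removed by subtracting the full sum.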
Theorem 3.
Consider a pattern σ whose class has a non-empty intersection with the set of representatives S = {⟨pre, fac, suf⟩, ⟨in, in, in⟩}. Equation (4) can be used to obtain $f_\sigma(x_{i,j})$ for a sequence $x_{1,n}$ wrt a window [i, j] whose type is in S, assuming that feature f has the sum decomposition property.

Proof. From Remark 1, we just consider ⟨pre, fac, suf⟩. We first establish properties between the maximum words associated with pre, fac and suf.
– From the convexity property, the signatures of the words $x_{\ell,j}$, $x_{i,j}$ and $x_{i,u}$ each contain at most one maximum word in $L_\sigma$. Because of pre, fac and suf, the signatures of the words $x_{\ell,j}$, $x_{i,j}$ and $x_{i,u}$ each contain at least one word in $L_\sigma$. Consequently, the signatures of the words $x_{\ell,j}$, $x_{i,j}$ and $x_{i,u}$ contain one single maximum word in $L_\sigma$, respectively denoted by $w_{pre}$, $w_{fac}$ and $w_{suf}$.
– Because the words $w_{pre}$ and $w_{fac}$ must not end after position j, and from the convexity property, $w_{pre}$ and $w_{fac}$ end in the same position ψ.
– Because the words $w_{suf}$ and $w_{fac}$ must not start before position i, and from the convexity property, $w_{suf}$ and $w_{fac}$ start at the same position λ.
– Because the word $w_{fac}$ starts at position λ and ends at position ψ we have that λ ≤ ψ.

With the positions 1, ℓ, i, λ, ψ, j, u, n laid out in this order, we have $f_\sigma(x_{1,j}) = f_\sigma(x_{\ell,j})$ and $f_{\sigma^r}(x_{n,i}) = f_{\sigma^r}(x_{u,i})$. Since f has the sum decomposition property, by using the function h of Definition 14, Equation (4) can be rewritten as:
\[
f_\sigma(x_{i,j}) = \underbrace{\sum_{t=\ell+b_\sigma}^{\psi-a_\sigma} h(x_t)}_{f_\sigma(x_{1,j})} + \underbrace{\sum_{t=\lambda+a_{\sigma^r}}^{u-b_{\sigma^r}} h(x_t)}_{f_{\sigma^r}(x_{n,i})} - \underbrace{\sum_{t=\ell+b_\sigma}^{u-a_\sigma} h(x_t)}_{f_\sigma(x_{1,n})} \tag{13}
\]
By using the fact that the pattern σ is reversible (i.e. $a_{\sigma^r} = b_\sigma$ and $b_{\sigma^r} = a_\sigma$) in the second term of Equation (13), and by expanding the terms $f_{\sigma^r}(x_{n,i})$ and $f_\sigma(x_{1,n})$, we obtain:
\[
\underbrace{\sum_{t=\ell+b_\sigma}^{\psi-a_\sigma} h(x_t)}_{f_\sigma(x_{1,j})} + \underbrace{\sum_{t=\lambda+b_\sigma}^{\psi-a_\sigma} h(x_t) + \sum_{t=\psi-a_\sigma+1}^{u-a_\sigma} h(x_t)}_{f_{\sigma^r}(x_{n,i})} - \underbrace{\Bigl(\sum_{t=\ell+b_\sigma}^{\psi-a_\sigma} h(x_t) + \sum_{t=\psi-a_\sigma+1}^{u-a_\sigma} h(x_t)\Bigr)}_{f_\sigma(x_{1,n})} = \sum_{t=\lambda+b_\sigma}^{\psi-a_\sigma} h(x_t) = f_\sigma(x_{i,j}).
\]
Hence, Equation (4) holds. ⊓⊔

Theorem 4.
Consider a pattern σ whose class has a non-empty intersection with the set of representatives S = {⟨pre, fac, suf⟩, ⟨in, in, in⟩}. Equation (4) can be used to obtain $f_\sigma(x_{i,j})$ for a sequence $x_{1,n}$ wrt a window [i, j] whose type is in S, assuming that the pair f, σ has the single position no-inflexion property.

Proof. Because of Remark 1 we only consider the representative ⟨pre, fac, suf⟩. In this context, for the reason quoted in the first part of the proof of Theorem 3, the signature of $x_{\ell,j}$ contains a maximum word in $L_\sigma$ ending at position ψ, the signature of $x_{i,j}$ contains a maximum word in $L_\sigma$ starting at λ and ending at ψ, and the signature of $x_{i,u}$ contains a maximum word in $L_\sigma$ starting at λ.

① When Condition (iii.a) of Definition 18 holds, by using the function h of Definition 18, and since $f_\sigma(x_{1,j}) = f_\sigma(x_{\ell,j})$ and $f_{\sigma^r}(x_{n,i}) = f_{\sigma^r}(x_{u,i})$ for the positions 1, ℓ, i, λ, ψ, j, u, n, Equation (4) can be rewritten as:
\[
f_\sigma(x_{i,j}) = \underbrace{h(x_{\ell+b_\sigma})}_{f_\sigma(x_{1,j})} + \underbrace{h(x_{\lambda+a_{\sigma^r}})}_{f_{\sigma^r}(x_{n,i})} - \underbrace{h(x_{\ell+b_\sigma})}_{f_\sigma(x_{1,n})} \tag{14}
\]
From (14), and by using the fact that $a_{\sigma^r} = b_\sigma$, we obtain $f_\sigma(x_{i,j}) = h(x_{\lambda+b_\sigma})$, which is true by Condition (iii.a) of Definition 18.

② When Condition (iii.b) of Definition 18 holds, Equation (4) can be proven in a similar way as in case ①.

Hence, Equation (4) holds. ⊓⊔

Theorem 5.
Consider a pattern σ whose class has a non-empty intersection with the set of representatives S = {⟨pre, fac, suf⟩, ⟨in, in, in⟩, ⟨pre, out, out⟩, ⟨out, out, suf⟩, ⟨in, out, out⟩, ⟨out, out, in⟩}. Equation (4) can be used to obtain $f_\sigma(x_{i,j})$ for a sequence $x_{1,n}$ wrt a window [i, j] whose type is in S, assuming that feature f has the same value property.

Proof. Because of Remark 1, we only consider the representatives ⟨pre, fac, suf⟩ and ⟨pre, out, out⟩.
• [⟨pre, fac, suf⟩] Since f has the same value property, Equation (4) can be rewritten as:
\[
f_\sigma(x_{i,j}) = \underbrace{f_\sigma(x_{1,n})}_{f_\sigma(x_{1,j})} + \underbrace{f_\sigma(x_{1,n})}_{f_{\sigma^r}(x_{n,i})} - \underbrace{f_\sigma(x_{1,n})}_{f_\sigma(x_{1,n})} \tag{15}
\]
From (15) we obtain $f_\sigma(x_{i,j}) = f_\sigma(x_{1,n})$, which is true by definition of the same value property. Hence, Equation (4) holds.
• [⟨pre, out, out⟩] Since f has the same value property, Equation (4) can be rewritten as:
\[
f_\sigma(x_{i,j}) = \underbrace{f_\sigma(x_{1,n})}_{f_\sigma(x_{1,j})} + \underbrace{0}_{f_{\sigma^r}(x_{n,i})} - \underbrace{f_\sigma(x_{1,n})}_{f_\sigma(x_{1,n})} \tag{16}
\]
From (16) we obtain $f_\sigma(x_{i,j}) = 0$, which is true by definition of the representative ⟨pre, out, out⟩. Hence, Equation (4) holds. ⊓⊔

Theorem 6.
Consider a pattern σ whose class has a non-empty intersection with the set of representatives S = {⟨pre, fac, suf⟩, ⟨in, in, in⟩, ⟨pre, out, out⟩, ⟨out, out, suf⟩, ⟨in, out, out⟩, ⟨out, out, in⟩}. Equation (4) can be used to obtain $f_\sigma(x_{i,j})$ for a sequence $x_{1,n}$ wrt a window [i, j] whose type is in S, assuming that the pair f, σ has the single position inflexion property.

Proof. When the pair f, σ has the single position inflexion property the following identity holds:
\[
f_\sigma(x_{k,k'}) = f_\sigma(x_{1,n}), \quad \forall k, k' \mid 1 \le k \le k' \le n \text{ and } \exists \text{ an extended } \sigma\text{-pattern occurrence in } [k, k'] \text{ wrt } x_{k,k'} \tag{17}
\]
Because of Remark 1 we only consider the representatives ⟨pre, fac, suf⟩ and ⟨pre, out, out⟩.
• [⟨pre, fac, suf⟩] By Identity (17) and because all three intervals [1, j], [i, j] and [i, n] contain an extended σ-pattern occurrence, $f_\sigma(x_{1,j}) = f_{\sigma^r}(x_{n,i}) = f_\sigma(x_{i,j}) = f_\sigma(x_{1,n})$; thus Equation (4) can be rewritten as:
\[
f_\sigma(x_{i,j}) = \underbrace{f_\sigma(x_{1,n})}_{f_\sigma(x_{1,j})} + \underbrace{f_\sigma(x_{1,n})}_{f_{\sigma^r}(x_{n,i})} - \underbrace{f_\sigma(x_{1,n})}_{f_\sigma(x_{1,n})} \tag{18}
\]
From (18) we obtain $f_\sigma(x_{i,j}) = f_\sigma(x_{1,n})$, which is true by definition of the single position property. Hence, Equation (4) holds.
• [⟨pre, out, out⟩] By Identity (17) $f_\sigma(x_{1,j}) = f_\sigma(x_{1,n})$; since there is no extended σ-pattern occurrence in [i, j] or in [i, n], $f_{\sigma^r}(x_{n,i}) = f_\sigma(x_{i,j}) = 0$, and Equation (4) can be rewritten as:
\[
f_\sigma(x_{i,j}) = \underbrace{f_\sigma(x_{1,n})}_{f_\sigma(x_{1,j})} + \underbrace{0}_{f_{\sigma^r}(x_{n,i})} - \underbrace{f_\sigma(x_{1,n})}_{f_\sigma(x_{1,n})} \tag{19}
\]
From (19) we obtain $f_\sigma(x_{i,j}) = 0$, which is true by definition of the second component of the ⟨pre, out, out⟩ representative. Hence, Equation (4) holds. ⊓⊔

Sufficient Conditions for the Validity of Equations (5) and (6)
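As used in the proofs of this section, Equation (5) takes the maximum of zero and the right-hand side of Equation (4), while Equation (6) returns zero for a window without σ-pattern and the right-hand side of Equation (4) otherwise. The following Python sketch shows how the three variants combine the same three quantities for every sliding window of size m. It is only an illustration: the function name `sliding_values`, its parameters, and the sample arrays are assumptions, and the arrays `fwd` and `bwd` (standing for the values $f_\sigma(x_{1,j})$ and $f_{\sigma^r}(x_{n,i})$) are supposed to be computed beforehand.

```python
def sliding_values(fwd, bwd, total, m, clamp=False, has_pattern=None):
    """Combine precomputed values into one value per sliding window.

    fwd[j-1] stands for f_sigma(x_{1,j}), bwd[i-1] for f_sigma^r(x_{n,i}) and
    total for f_sigma(x_{1,n}); how they are obtained is outside this sketch.
    clamp=True applies the max(0, .) of Equation (5). has_pattern, if given,
    holds one boolean per window and plays the role of the test
    "if no sigma-pattern in x_{i,j}" of Equation (6).
    """
    n = len(fwd)
    values = []
    for i in range(1, n - m + 2):            # window [i, j], 1-based
        j = i + m - 1
        v = fwd[j - 1] + bwd[i - 1] - total  # right-hand side of Equation (4)
        if clamp:
            v = max(0, v)
        if has_pattern is not None and not has_pattern[i - 1]:
            v = 0
        values.append(v)
    return values


# Synthetic values, not derived from a particular pattern or feature.
fwd = [0, 0, 0, 1]
bwd = [1, 0, 0, 0]
eq4_values = sliding_values(fwd, bwd, total=1, m=2)
eq5_values = sliding_values(fwd, bwd, total=1, m=2, clamp=True)
```

On these synthetic arrays the unclamped combination yields a negative value for the middle window, which the clamped variant replaces by zero; the loop performs a constant amount of work per window.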
Theorem 7.
Consider a pattern σ whose class has a non-empty intersection with the set of representatives S = {⟨pre, fac, suf⟩, ⟨in, in, in⟩, ⟨pre, out, out⟩, ⟨out, out, suf⟩, ⟨in, out, out⟩, ⟨out, out, in⟩, ⟨out, out, out⟩}. Equation (5) can be used to obtain $f_\sigma(x_{i,j})$ for a sequence $x_{1,n}$ wrt a window [i, j] whose type is in S, assuming that feature f has the sum decomposition and the positive properties. If, in addition to the set S, we also have the representative ⟨pre, out, suf⟩ then Equation (5) can still be used, provided that pattern σ has the exclude-out-in property.

Proof. Because of Remark 1 we only consider the representatives ⟨pre, fac, suf⟩, ⟨pre, out, out⟩ and ⟨out, out, out⟩.
• [⟨pre, fac, suf⟩] Since, from Theorem 3, Equation (4) is valid for this representative when f has the sum decomposition property, and since f has the positive property, the right-hand side of Equation (5) is the maximum between zero and a positive value; therefore Equation (5) is also valid for ⟨pre, fac, suf⟩.
• [⟨pre, out, out⟩] Since f has the positive and sum decomposition properties, $f_\sigma(x_{1,n}) \ge 0$ and $f_\sigma(x_{1,j}) \le f_\sigma(x_{1,n})$. Due to the third component “out” of the representative, $f_{\sigma^r}(x_{n,i}) = 0$. When Equation (4) is used, we obtain $f_\sigma(x_{i,j}) \le 0$; but with Equation (5) we get $f_\sigma(x_{i,j}) = 0$, which is true due to the second component “out” of the representative. Hence, Equation (5) is valid.
• [⟨out, out, out⟩] Since f has the positive property, $f_\sigma(x_{1,n}) \ge 0$. Due to the first and third “out” components of the representative, $f_\sigma(x_{1,j}) = f_{\sigma^r}(x_{n,i}) = 0$. When Equation (4) is used, we obtain $f_\sigma(x_{i,j}) \le 0$; but with Equation (5) we get $f_\sigma(x_{i,j}) = 0$, which is true due to the second component “out” of the representative. Hence, Equation (5) is valid.
• [⟨pre, out, suf⟩]
– From the convexity property, the signatures of the words $x_{\ell,j}$ and $x_{i,u}$ each contain at most one maximum word in $L_\sigma$. Because of pre and suf, the signatures of the words $x_{\ell,j}$ and $x_{i,u}$ each contain at least one word in $L_\sigma$. Consequently, the signatures of the words $x_{\ell,j}$ and $x_{i,u}$ contain one single maximum word in $L_\sigma$, respectively denoted by $w_{pre}$ and $w_{suf}$.
– Because of the out of ⟨pre, out, suf⟩, the signature of the word $x_{i,j}$ does not contain any subword that belongs to $L_\sigma$. In addition, since the pattern σ has the exclude-out-in property we have that $w_{pre}$ ends before position i, and $w_{suf}$ starts after position j, i.e. $w_{pre}$ and $w_{suf}$ do not overlap.
Consequently, since in addition f has the positive and the sum decomposition properties, $f_\sigma(x_{1,n}) \ge 0$ and $f_\sigma(x_{1,j}) + f_{\sigma^r}(x_{n,i}) \le f_\sigma(x_{1,n})$. When Equation (4) is used, we obtain $f_\sigma(x_{i,j}) \le 0$; but with Equation (5) we get $f_\sigma(x_{i,j}) = 0$, which is true due to the second component “out” of the representative. Hence, Equation (5) is valid. ⊓⊔

Theorem 8.
Consider a pattern σ whose class has a non-empty intersection with the set of representatives S = {⟨pre, fac, suf⟩, ⟨in, in, in⟩, ⟨pre, out, out⟩, ⟨out, out, suf⟩, ⟨in, out, out⟩, ⟨out, out, in⟩, ⟨out, out, out⟩}. Equation (5) can be used to obtain $f_\sigma(x_{i,j})$ for a sequence $x_{1,n}$ wrt a window [i, j] whose type is in S, assuming that feature f has the same value and the positive properties.

Proof. Because of Remark 1 we only consider the representatives ⟨pre, fac, suf⟩, ⟨pre, out, out⟩ and ⟨out, out, out⟩.
• [⟨pre, fac, suf⟩] Since, from Theorem 5, Equation (4) is valid for this representative when f has the same value property, and since f has the positive property, the right-hand side of Equation (5) is the maximum between zero and a positive value; therefore Equation (5) is also valid for ⟨pre, fac, suf⟩.
• [⟨pre, out, out⟩] Since f has the same value property, $f_\sigma(x_{1,j}) = f_\sigma(x_{1,n})$. Due to the third component “out” of the ⟨pre, out, out⟩ representative, $f_{\sigma^r}(x_{n,i}) = 0$. Consequently, by Equation (5), we have $f_\sigma(x_{i,j}) = \max(0, f_\sigma(x_{1,j}) + f_{\sigma^r}(x_{n,i}) - f_\sigma(x_{1,n})) = 0$, which is true due to the second component “out” of the ⟨pre, out, out⟩ representative. Hence, Equation (5) is valid.
• [⟨out, out, out⟩] Since f has the positive property, $f_\sigma(x_{1,n}) \ge 0$. Due to the first and third “out” components of the ⟨out, out, out⟩ representative, $f_\sigma(x_{1,j}) = f_{\sigma^r}(x_{n,i}) = 0$. When Equation (4) is used, we obtain $f_\sigma(x_{i,j}) \le 0$; but with Equation (5) we get $f_\sigma(x_{i,j}) = 0$, which is true due to the second component “out” of the ⟨out, out, out⟩ representative. Hence, Equation (5) is valid. ⊓⊔

Theorem 9.
Consider a pattern σ whose class has a non-empty intersection with the set of representatives S = {⟨pre, fac, suf⟩, ⟨in, in, in⟩, ⟨pre, out, suf⟩, ⟨in, out, in⟩, ⟨pre, out, out⟩, ⟨in, out, out⟩, ⟨out, out, suf⟩, ⟨out, out, in⟩, ⟨out, out, out⟩}. Equation (6) can be used to obtain $f_\sigma(x_{i,j})$ for a sequence $x_{1,n}$ wrt a window [i, j] whose type is in S if (a) neither ⟨pre, fac, suf⟩ nor ⟨in, in, in⟩ is a representative of the pattern σ, or (b) one of the following conditions holds:
i) f has the sum decomposition property.
ii) f has the same value property.
iii) the pair f, σ has the single position no-inflexion or the single position inflexion property.

Proof. [CASE 1] Consider the representatives that have an extended σ-pattern occurrence in [i, j]. In this case the only two representatives are ⟨pre, fac, suf⟩ and ⟨in, in, in⟩. Since Equation (4) is valid for these representatives when:
i) f has the sum decomposition property (see Theorem 3),
ii) f has the same value property (see Theorem 5),
iii) the pair f, σ has the single position no-inflexion property (see Theorem 4) or the single position inflexion property (see Theorem 6),
Equation (6) is also valid.
[CASE 2] Consider the representatives that do not have an extended σ-pattern occurrence in [i, j]. When using Equation (6), because of the check “if no σ-pattern in $x_{i,j}$” in Equation (6), the value of $f_\sigma(x_{i,j})$ is zero. Hence, Equation (6) is valid. ⊓⊔

Synthesis
The classification induced by Theorems 3 to 9 is presented in Table 4: for each pattern class corresponding to the same set of representative triples (see Figure 3) we select one pattern (see the columns of Table 4, e.g. Plain) and provide for each feature property (see the rows of Table 4, e.g. SV) and for each feature/pattern property (see the cells of Table 4, e.g. SPN) the theorem proving that an Equation is valid under such properties. Note that any missing Equation is due to a counterexample given in Appendix B, and not to the fact that we are missing a theorem. Grey cells indicate a non-existing time-series constraint in the time-series catalogue [3].

Equation (6) can be used to compute the value of $f_\sigma(x_{i,j})$ for all reversible and convex patterns without the single letter property from [3], except for Zigzag with the max and min features (see the cells marked with “none” in Table 4), as Zigzag uses the representative triple ⟨in, in, in⟩ without having the single position inflexion or the single position no-inflexion properties.

Since checking each window of m consecutive positions of a sequence of size n independently gives a time complexity of O(m · n), we now introduce a theorem leading to an optimal time complexity.

Theorem 10.
The time complexity of evaluating Equations (4) and (5) on a sequence $X = x_1 x_2 \dots x_n$ for all sliding windows of size m is Θ(n). Moreover, assuming one can check in constant time whether a sliding window of the sequence X contains or not a σ-pattern, the time complexity of evaluating Equation (6) for all sliding windows of size m of sequence X is also Θ(n).

Proof. Evaluating (4), (5) and (6) for all sliding windows [i, j] (with $i \in [1, n-m+1]$ and $j = i+m-1$) requires evaluating $f_\sigma(x_{1,i+m-1})$, $f_\sigma(x_{i,n})$ and $f_\sigma(x_{1,n})$.
– First, note that within Equation (6) all the tests “if no σ-pattern in $x_{i,j}$” on the different sliding windows (with $i \in [1, n-m+1]$ and $j = i+m-1$) can be done in O(n) because of our assumption.
– Second, evaluating $f_\sigma(x_{1,i+m-1})$ for all $i \in [1, n-m+1]$ as well as $f_\sigma(x_{1,n})$ can be done in O(n) by using a register automaton [6] for sum_f_σ($r, x_1 x_2 \dots x_n$), which exposes all its intermediate register values [9].
– Third, since the pattern σ is reversible and since the feature f is commutative, $f_\sigma(x_{i,n}) = f_{\sigma^r}(x_{n,i})$. Evaluating $f_{\sigma^r}(x_{n,i})$ for all $i \in [1, n-m+1]$ can also be done in O(n) by using a register automaton for sum_f_σ^r($r, x_n x_{n-1} \dots x_1$), which exposes all its intermediate register values.
Therefore, the time complexity of evaluating Equations (4), (5) and (6) is O(n). Since each variable of $x_{1,n}$ needs to be scanned at least once to identify pattern occurrences, this time complexity is optimal. ⊓⊔

[Table 4: columns (patterns σ): DecSeq, Gorge, Valley, Plain, Zigzag, SteadySeq, each annotated with its representative triples and pattern properties; rows (features f): one, width, surf, max, min, each annotated with its feature properties; the cell contents are not recoverable from the extracted text.]
Table 4: Indicates, for existing combinations of feature f and pattern σ from the time-series catalogue, which of the Equations (4), (5) and (6) are valid, as well as the corresponding justifying theorem, where: (i) within a representative triple we use as a shortcut the first letter of each component; (ii) p, sp, sd and sv resp. indicate whether the feature f has the positive, the single position, the sum decomposition or the same value property; (iii) n, e, and o resp. indicate whether the pattern σ has the no-inflexion, the exclude-out-in or the one-inflexion property; (iv) spn, spo resp. indicate whether the pair f, σ has the single position no-inflexion property or the single position inflexion property.

Pattern Properties for Checking in Linear Time the Occurrence of Pattern in Sliding Windows
We now introduce some additional pattern properties to check in time O(n) whether or not the different sliding windows of size m of a sequence $X = x_1 x_2 \dots x_n$ contain a pattern occurrence. As these properties cover all reversible patterns of the time-series catalogue, one can also use Equation (6) for such patterns for the entries of Table 4 mentioning (6).

Definition 20. A pattern σ has the letter property wrt a letter e if e is a word in $L_\sigma$, and if any word of $L_\sigma$ contains at least one occurrence of e, i.e. if $L_\sigma \cap \{e\} \neq \emptyset$ and if $L_\sigma \cap (\Sigma \setminus e)^* = \emptyset$.

Definition 21. A pattern σ has the suffix-unavoidable property wrt a letter e ∈ {‘<’, ‘=’, ‘>’} if all words in $L_\sigma$ contain at least one occurrence of e, and if each suffix starting with the letter e of any word of $L_\sigma$ belongs also to $L_\sigma$, i.e. if $L_\sigma \cap (\Sigma \setminus e)^* = \emptyset$ and if $\mathit{shuffle}(L_\sigma, s) \wedge \Sigma^* s\, e\, \Sigma^* \wedge \Sigma^* s\, (\Sigma^* \setminus L_\sigma) = \emptyset$.

Definition 22. A pattern σ has the incompressible property if all proper factors of any word in $L_\sigma$ do not belong to $L_\sigma$, i.e. if $\Sigma^+ L_\sigma \Sigma^* \cap L_\sigma = \emptyset$ and if $\Sigma^* L_\sigma \Sigma^+ \cap L_\sigma = \emptyset$.

Definition 23.
A pattern σ has the factor property if for any word w in $L_\sigma$ all factors of w, whose length is greater than or equal to the smallest length $\omega_\sigma$ of a word in $L_\sigma$, belong also to $L_\sigma$, i.e. if $\mathit{shuffle}(\mathit{shuffle}(L_\sigma, s), s) \wedge \Sigma^* s \Sigma^* \Sigma^{\omega_\sigma} \Sigma^* s \Sigma^* \wedge \Sigma^* s\, (\Sigma^* \setminus L_\sigma)\, s \Sigma^* = \emptyset$.

Example 9 (pattern properties, continuation of Example 4).
– Eight out of the reversible patterns of [3] have the letter property. For instance, the patterns Dec, DecSeq and StrictlyDecSeq all have the letter property wrt ‘>’ since (i) the word ‘>’ is in $L_{Dec}$, in $L_{DecSeq}$ and in $L_{StrictlyDecSeq}$, and (ii) any word in $L_{Dec}$, in $L_{DecSeq}$ or in $L_{StrictlyDecSeq}$ contains at least one occurrence of ‘>’.
– out of the reversible patterns of [3] have the suffix-unavoidable property. For instance, the pattern Peak has the suffix-unavoidable property wrt the letter ‘<’, since (i) any occurrence of peak contains at least one occurrence of ‘<’, and since (ii) any suffix, starting with a ‘<’, of a word of $L_{Peak}$ is also a peak.
– Six out of the reversible patterns of [3] have the incompressible property. The pattern DecTerrace has the incompressible property because, if an occurrence of the letter ‘>’ is removed from any word in $L_{DecTerrace}$, the corresponding proper factor is not in $L_{DecTerrace}$.
– Seven out of the reversible patterns of [3] have the factor property. For instance, the pattern Zigzag has the factor property because any factor of length greater than or equal to $\omega_{Zigzag} = 3$ of a zigzag is also a zigzag.

For each pattern property described in Definitions 20 to 23 we now show how to check in O(n) which sliding windows are empty or not; each check below holds (or returns true) when the window is empty.
– Consider a pattern σ that has the letter property wrt a letter e. First compute in one scan the number of occurrences nocc[k] of e in $x_{1,k}$ for all $k \in [1, n]$; second, for each sliding window [i, j], check in constant time that nocc[i] = nocc[j].
– Consider a pattern σ that has the suffix-unavoidable property wrt a letter e. First compute in one scan the number of occurrences nocc1[k] of e in $x_{1,k}$ for all $k \in [1, n]$; second compute in one scan the number of maximal occurrences nocc2[k] of pattern σ in $x_{1,k}$ for all $k \in [1, n]$; third, for each sliding window [i, j], check in constant time that nocc1[i] = nocc1[j] ∨ nocc2[i] = nocc2[j].
– Consider a pattern σ that has the incompressible or the factor property. First compute for each k = 1, 2, ..., n the end end[k] of the next pattern occurrence (which will be set to n + 1 if no pattern occurrence ends after k, e.g. end[n] = n + 1). Second compute for each k = n, n − 1, ..., 1 the start start[k] of the previous pattern occurrence (which will be set to 0 if no pattern occurrence starts before k, e.g. start[1] = 0). Third, depending on whether the pattern has the incompressible or the factor property, do the following check in constant time for each sliding window [i, j]:
  • [incompressible]
      return end[i] > j ∨ start[j] < i
  • [factor]
      endi = end[i], startj = start[j]
      if endi > n ∨ startj < 1 then return true
      if endi − i ≥ ω_σ then i′ = i else i′ = endi
      if j − startj ≥ ω_σ then j′ = j else j′ = startj
      endi′ = end[i′], startj′ = start[j′]
      if endi′ > n ∨ startj′ < 1 then return true
      return min(j′, endi′) − max(i′, start[min(j′, endi′)]) < ω_σ
Computing the end (resp. start) of the next (resp. previous) pattern occurrence is done by using a register automaton derived from the transducer [6] which recognises pattern occurrences. Figure 5 gives the register automata associated with the Plain and the Zigzag patterns. In (A), the dotted transition marks the end of a plain. In (B), the dashed (resp. dotted) transitions indicate that we are inside a zigzag (resp. that a zigzag is ending). Depending on whether we were in a zigzag or not we set end[n − 1] to n or to n + 1.

Example 10 (Running automata that compute the end of the next pattern occurrence). Table 5 (resp. Table 6) shows an example of execution of the register automaton given in Part (A) (resp. (B)) of Figure 5.
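The check described above for a pattern with the letter property can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names `signature` and `empty_windows_letter` are hypothetical, nocc[k] is taken to count the occurrences of e in the signature of $x_{1,k}$, and a window is considered non-empty as soon as its signature contains the letter e (which is what Definition 20 gives, since e itself is a word of $L_\sigma$ and every word of $L_\sigma$ contains e).

```python
def signature(xs):
    """One letter among '<', '=', '>' for each pair of consecutive values."""
    return ['<' if a < b else '=' if a == b else '>'
            for a, b in zip(xs, xs[1:])]


def empty_windows_letter(xs, m, e):
    """For each window [i, i+m-1] (1-based), True iff its signature lacks e.

    nocc[k] counts the occurrences of e in the signature of x_1..x_k, i.e.
    among the first k-1 signature letters; nocc[0] is unused padding.
    """
    sig = signature(xs)
    n = len(xs)
    nocc = [0] * (n + 1)
    for k in range(2, n + 1):                # one scan over the sequence
        nocc[k] = nocc[k - 1] + (sig[k - 2] == e)
    # Constant-time test per window: no e between positions i and j.
    return [nocc[i] == nocc[i + m - 1] for i in range(1, n - m + 2)]


# Illustrative data (not from the paper): windows of size 3, letter '>'.
xs = [4, 3, 3, 1, 2, 2, 5]
result = empty_windows_letter(xs, 3, '>')
```

Here `result` is `[False, False, False, True, True]`: only the last two windows have a signature without ‘>’. The prefix counts are built in one scan and each window is answered with a single comparison, in line with the O(n) bound stated above.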
Rather than stating a time-series constraint on each window of size m, which would result in an O(m · n) space complexity, we now show how to reformulate the slide_sum_f_σ($m, \mathit{low}, \mathit{up}, x_1 x_2 \dots x_n$) constraint as a conjunction of constraints with a space complexity of Θ(n). This reformulation was extended to the patterns of Table 1 for Equation (6) to reformulate condition “if no σ-pattern in $x_{i,j}$”, but is not described here for space reasons.

Theorem 11.
For those time-series constraints for which Equation (4) or (5) holds, the constraint slide_sum_f_σ($m, \mathit{low}, \mathit{up}, x_1 x_2 \dots x_n$) can be reformulated with a space complexity of Θ(n).

How to generate a transducer that recognises all maximal pattern occurrences was described in [17].

Proof. For Equation (4), it can be reformulated as the conjunction
\[
\begin{array}{l}
\mathit{sum\_f\_}\sigma(r,\; x_1 x_2 \dots x_n,\; \overrightarrow{r}_1 \overrightarrow{r}_2 \dots \overrightarrow{r}_n) \;\wedge \\
\mathit{sum\_f\_}\sigma^r(r,\; x_n x_{n-1} \dots x_1,\; \overleftarrow{r}_1 \overleftarrow{r}_2 \dots \overleftarrow{r}_n) \;\wedge \\
\forall i \in [1, n-m+1] : r_{i,j} = \overrightarrow{r}_j + \overleftarrow{r}_i - r \quad (\text{with } j = i+m-1) \;\wedge \\
\mathit{low} = \min(r_{1,m},\, r_{2,m+1},\, \dots,\, r_{n-m+1,n}) \;\wedge \\
\mathit{up} = \max(r_{1,m},\, r_{2,m+1},\, \dots,\, r_{n-m+1,n})
\end{array} \tag{20}
\]
where $\overrightarrow{r}_j$ (resp. $\overleftarrow{r}_i$) is the exposed register value corresponding to the first argument of sum_f_σ($\overrightarrow{r}_j, x_1 x_2 \dots x_j$) (resp. sum_f_σ^r($\overleftarrow{r}_i, x_n x_{n-1} \dots x_i$)). For Equation (5), we replace in (20) the term $r_{i,j} = \overrightarrow{r}_j + \overleftarrow{r}_i - r$ by the term $r_{i,j} = \max(0, \overrightarrow{r}_j + \overleftarrow{r}_i - r)$. ⊓⊔

[Figure 5: register automata (state diagrams not reproduced). (A) transitions, with ◦ ∈ {<, =, >}: on $x_k \circ x_{k+1}$, either end[k] = end[k + 1] or end[k] = k + 1. (B) initialisation end[n − 1] = n + 1 − in[n − 1]; transitions, with ◦ ∈ {<, =, >}: on $x_k \circ x_{k+1}$, either end[k − 1] = end[k], in[k] = 0, or end[k − 1] = end[k], in[k] = 1, or end[k − 1] = k, in[k] = 0.]
Fig. 5: Register automata computing the end of the next pattern maximal occurrence for (A) the Plain and (B) the
Zigzag patterns

(B1)
x_k s_k x_{k+1}: < > < > = < > < < >
end[k − 1]: 5 5 5 5 5 9 9 9 9 12 12
end[k]: 5 5 5 5 9 9 9 9 12 12 12
in[k]: 0 0 1 1 0 0 0 1 0 0 1

(B2)
x_k s_k x_{k+1}: > < > > < > = < > < >
k: 12 11 10 9 8 7 6 5 4 3 2
end[k − 1]: 9 9 9 9 6 6 6 1 1 1 1
end[k]: 9 9 9 6 6 6 1 1 1 1 1
in[k]: 0 0 1 0 0 1 0 0 0 1 1

Table 6: Running the register automaton of Figure 5 that computes the end of the next zigzag on (B1) the sequence x = 010100101201 and (B2) on its reverse

Based on a detailed analysis of feature and pattern properties of time-series constraints of the time-series catalogue that use the Sum aggregator, we came up with a Θ(n) time complexity checker, and a Θ(n) space complexity reformulation for such constraints. It is an open question how to generalise our results to other aggregators such as min or max. Unlike the sum aggregator, the equality g(a, x) = b where a, b are fixed integers and x is a variable does not uniquely determine x when g ∈ {min, max}.

Acknowledgment We thank Pierre Flener for some feedback on an early version of this paper, and Colin de la Higuera for discussions on regular expressions, on the properties of their languages and on operators such as shuffle.

References
1. Alur, R., Fisman, D., Raghothaman, M.: Regular programming for quantitative properties of data streams. In: Thiemann, P. (ed.) Programming Languages and Systems - 25th European Symposium on Programming, ESOP 2016, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2016, Eindhoven, The Netherlands, April 2-8, 2016, Proceedings. Lecture Notes in Computer Science, vol. 9632, pp. 15–40. Springer (2016)
2. Arafailova, E., Beldiceanu, N., Douence, R., Carlsson, M., Flener, P., Rodríguez, M.A.F., Pearson, J., Simonis, H.: Global constraint catalog, volume ii, time-series constraints. CoRR abs/1609.08925 (2016), http://arxiv.org/abs/1609.08925
3. Arafailova, E., Beldiceanu, N., Douence, R., Carlsson, M., Flener, P., Rodríguez, M.A.F., Pearson, J., Simonis, H.: Global constraint catalog, volume II, time-series constraints. arXiv preprint arXiv:1609.08925 (2016)
4. Beldiceanu, N., Contejean, E.: Introducing Global Constraints in CHIP. Mathl. Comput. Modelling 20(12), 97–123 (1994)
5. Beldiceanu, N., Carlsson, M.: Revisiting the cardinality operator and introducing the cardinality-path constraint family. In: Codognet, P. (ed.) ICLP 2001. LNCS, vol. 2237, pp. 59–73. Springer (2001)
6. Beldiceanu, N., Carlsson, M., Douence, R., Simonis, H.: Using finite transducers for describing and synthesising structural time-series constraints. Constraints 21(1), 22–40 (January 2016), journal fast track of CP 2015: summary on p. 723 of LNCS 9255, Springer, 2015
7. Beldiceanu, N., Carlsson, M., Petit, T.: Deriving filtering algorithms from constraint checkers. In: Wallace, M. (ed.) CP 2004. LNCS, vol. 3258, pp. 107–122. Springer (2004)
8. Bessière, C., Hebrard, E., Hnich, B., Kiziltan, Z., Walsh, T.: SLIDE: A useful special case of the CARDPATH constraint. In: Ghallab, M., et al. (eds.) ECAI 2008. pp. 475–479. IOS Press (2008)
9. Carlsson, M., et al.: SICStus Prolog User’s Manual. RISE SICS AB, 4.5.1 edn. (April 2019)
10. Hopcroft, J.E., Motwani, R., Ullman, J.D.: Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, 3rd edn. (2007)
11. Lallouet, A., Law, Y.C., Lee, J.H., Siu, C.F.K.: Constraint programming on infinite data streams. In: Walsh, T. (ed.) IJCAI 2011, Proceedings of the 22nd International Joint Conference on Artificial Intelligence, Barcelona, Catalonia, Spain, July 16-22, 2011. pp. 597–604. IJCAI/AAAI (2011)
12. Lee, J.C.H., Lee, J.H.M., Zhong, A.Z.: Augmenting stream constraint programming with eventuality conditions. In: Hooker, J.N. (ed.) Principles and Practice of Constraint Programming - 24th International Conference, CP 2018, Lille, France, August 27-31, 2018, Proceedings. Lecture Notes in Computer Science, vol. 11008, pp. 242–258. Springer (2018)
13. Maher, M.J., Narodytska, N., Quimper, C.G., Walsh, T.: Flow-Based Propagators for the sequence and Related Global Constraints. In: Stuckey, P.J. (ed.) Principles and Practice of Constraint Programming (CP’2008). LNCS, vol. 5202, pp. 159–174. Springer-Verlag (2008)
14. Pesant, G.: A regular language membership constraint for finite sequences of variables. In: Wallace, M. (ed.) CP 2004. LNCS, vol. 3258, pp. 482–495. Springer (2004)
15. Picard-Cantin, É., Bouchard, M., Quimper, C., Sweeney, J.: Learning Parameters for the Sequence Constraint from Solutions. In: Rueher, M. (ed.) Principles and Practice of Constraint Programming (CP’2016). LNCS, vol. 9892, pp. 405–420. Springer-Verlag (2016)
16. Régin, J.C., Puget, J.F.: A Filtering Algorithm for Global Sequencing Constraints. In: Smolka, G. (ed.) Principles and Practice of Constraint Programming (CP’97). LNCS, vol. 1330, pp. 32–46. Springer-Verlag (1997)
17. Rodríguez, M.A.F., Flener, P., Pearson, J.: Automatic generation of descriptions of time-series constraints. In: 29th IEEE International Conference on Tools with Artificial Intelligence, ICTAI 2017, Boston, MA, USA, November 6-8, 2017. pp. 102–109. IEEE Computer Society (2017)
18. Vaandrager, F.: Model learning. Communications of the ACM 60(2), 86–95 (February 2017)

List of Feasible Types with Corresponding Witnesses

Table 7 provides for each of the feasible type $\langle t_1, t_2, t_3 \rangle$ that occurs in the map shown by Figure 1, a regular expression for which the language defined by Theorem 2 is not empty. For instance, the type ⟨suf, fac, pre⟩ can be obtained from the following regular expression ‘<<<=<<<|<<=<|<=<<|=’ as illustrated by the figure below.
[Figure: illustration of the type ⟨suf, fac, pre⟩ for positions i, j and k: a word, a suffix, a factor in i..j and a prefix in i..k.]

Triple               Witness
⟨fac, fac, fac⟩      <<=<<|=
⟨fac, fac, in⟩       <=<<|=
⟨fac, fac, pre⟩      <<=<<<|<=<<|=<<|=
⟨fac, fac, suf⟩      <<=<=|=
⟨fac, out, fac⟩      <=<=<|=
⟨fac, out, in⟩       <<=<=|<=
⟨fac, out, out⟩      <=<<|=
⟨fac, out, pre⟩      <=<<>|<<|=
⟨fac, out, suf⟩      <=<=|=
⟨fac, pre, in⟩       <=<=|=
⟨fac, pre, pre⟩      <=<<|=
⟨in, fac, fac⟩       <<=<|=
⟨in, fac, in⟩        <<=<<|<<=<|<=<<|=
⟨in, fac, pre⟩       <<=<<<|<<=<|<=<<|=
⟨in, fac, suf⟩       <<=<=|<<=<|=
⟨in, in, in⟩         <<|<
⟨in, in, pre⟩        <<=|<
⟨in, out, fac⟩       <=<=<|<=
⟨in, out, in⟩        <<<|<<
⟨in, out, out⟩       <=|<
⟨in, out, pre⟩       <<<=|<<
⟨in, out, suf⟩       <=<=|<=
⟨in, pre, in⟩        <<=<|<<=|<
⟨in, pre, pre⟩       <<=|<
⟨in, suf, fac⟩       <=<=|<
⟨in, suf, in⟩        <=<<|=<<|<
⟨in, suf, pre⟩       <=<<=|=<<|<
⟨in, suf, suf⟩       <<=|=
⟨out, out, fac⟩      <<=<|=
⟨out, out, in⟩       <=|=
⟨out, out, out⟩      <<<<|<<<
⟨out, out, pre⟩      <<=<|<=
⟨out, out, suf⟩      <<=|=
⟨pre, fac, fac⟩      <=<==|<
⟨pre, fac, in⟩       <=<=<|=<=<|<
⟨pre, fac, pre⟩      <<=<<<|<=<<|<<=|=
⟨pre, fac, suf⟩      <=<=<|<
⟨pre, out, fac⟩      <=<=|<
⟨pre, out, in⟩       <=<=|<=
⟨pre, out, out⟩      <==|<
⟨pre, out, pre⟩      <=<=<|<=
⟨pre, out, suf⟩      <=<|<
⟨pre, pre, in⟩       <=<|<
⟨pre, pre, pre⟩      <<==|<
⟨suf, fac, fac⟩      <<<=<<|<<=<|<<=|=
⟨suf, fac, in⟩       <<<=<<|<<=<|<=<<|=
⟨suf, fac, pre⟩      <<<=<<<|<<=<|<=<<|=
⟨suf, fac, suf⟩      <<<=<<|<<=<|=<<|=
⟨suf, in, in⟩        <==|=
⟨suf, in, pre⟩       <=<|=
⟨suf, out, fac⟩      <<=><|<=|>
⟨suf, out, in⟩       <===|==
⟨suf, out, out⟩      <<=<|<=
⟨suf, out, pre⟩      <===<|==
⟨suf, out, suf⟩      <<=<=|<=
⟨suf, pre, in⟩       <<=<=|<=<|=
⟨suf, pre, pre⟩      <<=<<|<=<|=
⟨suf, suf, fac⟩      <<=<|=
⟨suf, suf, in⟩       <=<|=
⟨suf, suf, pre⟩      <<=<<|<=<|=
⟨suf, suf, suf⟩      <<==|=

Table 7: List of feasible types and associated regular-expression witnesses

Counterexamples for Equations (4), (5) and (6)

For each time-series constraint of the time-series constraint catalogue, this appendix provides small time series that are counterexamples to the validity of Equations (4), (5) and (6), for all equations missing in Table 4. For instance, for nb_decreasing_sequence and Equation (4), we get the following counterexample: consider the three windows of size wrt the sequence h , , , − i ; using Equation (4) returns h , , i rather than the expected values h , , i , i.e. on the second subsequence “ , ”, (4) returns − rather than the expected value ; Value reflects the fact that subsequence “ , ” does not contain any decreasing sequence.

constraint (4) (5) (6)
nb_decreasing_sequence , h , , , − i , h , , i , h , , i , h , , , − i , h , , i , h , , i -
sum_width_decreasing_sequence , h , , , , − i , h , , , i , h , − , − , i - -
sum_surf_decreasing_sequence , h− , − , − , − , − , i , h− , , , − , i , h− , , , − , i , h− , , − i , h , − i , h , i -
sum_max_decreasing_sequence , h , − , − , − i , h , , − i , h , − , − i , h− , − , i , h− , i , h , i -
sum_min_decreasing_sequence , h , − , − , − i , h− , , − i , h− , − , − i , h− , , − i , h , − i , h , i -
nb_decreasing_terrace , h , , , − i , h , , i , h , − , i - -
sum_width_decreasing_terrace , h , , , − i , h , , i , h , − , i - -
sum_surf_decreasing_terrace , h , − , − , − i , h , , i , h , , i , h , − , − , − i , h , , i , h , , i -
sum_height_decreasing_terrace , h , − , − , − i , h , , i , h , , i , h , − , − , − i , h , , i , h , , i -
nb_gorge - - -
sum_width_gorge , h , − , , i , h , , i , h , − , i - -
sum_surf_gorge , h− , − , , i , h , , i , h , − , i , h , − 
, i , h− i , h i - sum_min_gorge - , h , − , i , h− i , h i - nb_increasing_sequence , h− , , , i , h , , i , h , , i , h− , , , i , h , , i , h , , i - sum_width_increasing_sequence , h− , , , , i , h , , , i , h , − , − , i - - sum_surf_increasing_sequence , h− , − , − , − , i , h− , , , i , h− , , , i , h− , − , i , h , − i , h , i - sum_max_increasing_sequence , h− , − , − , i , h− , , i , h− , − , i , h− , − , i , h− , i , h , i - sum_min_increasing_sequence , h− , − , − , i , h− , , − i , h− , − , − i , h− , − , i , h , − i , h , i - nb_increasing_terrace , h− , , , i , h , , i , h , − , i - - sum_width_increasing_terrace , h− , , , i , h , , i , h , − , i - - sum_surf_increasing_terrace , h− , − , − , i , h , , i , h , , i , h− , − , − , i , h , , i , h , , i - sum_height_increasing_terrace , h− , − , − , i , h , , i , h , , i , h− , − , − , i , h , , i , h , , i - nb_peak , h− , , , i , h , , i , h , − , i - - sum_width_peak , h− , , , i , h , , i , h , − , i - - sum_surf_peak , h− , − , , − i , h , , i , h , , i , h− , − , − , i , h− , i , h , i - sum_max_peak , h− , , , i , h , , i , h , − , i , h− , − , − , i , h− , i , h , i - onstraint (4) (5) (6) nb_plain , h , − , − , i , h , , i , h , − , i - - sum_width_plain , h , − , − , i , h , , i , h , − , i - - sum_surf_plain , h , − , − , i , h , , i , h , , i , h , − , i , h− i , h i - sum_height_plain , h , − , − , i , h , , i , h , , i , h , − , i , h− i , h i - nb_plateau , h− , , , i , h , , i , h , − , i - - sum_width_plateau , h− , , , i , h , , i , h , − , i - - sum_surf_plateau , h− , , , i , h , , i , h , − , i , h− , − , − , i , h− , i , h , i - sum_height_plateau , h− , , , i , h , , i , h , − , i , h− , − , − , i , h− , i , h , i - nb_proper_plain , h , − , − , i , h , , i , h , − , i - - sum_width_proper_plain , h , − , − , i , h , , i , h , − , i - - sum_surf_proper_plain , h , − , − , i , h , , i , h , , i , h , − , − , i , h , , i , h , , i - sum_height_proper_plain , h , − , − , i , h , , i 
, h , , i , h , − , − , i , h , , i , h , , i - nb_proper_plateau , h− , , , i , h , , i , h , − , i - - sum_width_proper_plateau , h− , , , i , h , , i , h , − , i - - sum_surf_proper_plateau , h− , , , i , h , , i , h , − , i , h− , − , − , − , i , h , , , i , h , , , i - sum_height_proper_plateau , h− , , , i , h , , i , h , − , i , h− , − , − , − , i , h , , , i , h , , , i - nb_steady_sequence - - - sum_width_steady_sequence - - - sum_surf_steady_sequence - , h− , − , i , h− , i , h , i - sum_height_steady_sequence - , h− , − , i , h− , i , h , i - nb_strictly_decreasing_sequence - - - sum_width_strictly_decreasing_sequence - - - sum_surf_strictly_decreasing_sequence - , h− , , − i , h , − i , h , i - sum_max_strictly_decreasing_sequence - , h− , − , i , h− , i , h , i - sum_min_strictly_decreasing_sequence - , h− , , − i , h , − i , h , i - nb_strictly_increasing_sequence - - - sum_width_strictly_increasing_sequence - - - sum_surf_strictly_increasing_sequence - , h− , − , i , h , − i , h , i - sum_max_strictly_increasing_sequence - , h− , − , i , h− , i , h , i - sum_min_strictly_increasing_sequence - , h− , − , i , h , − i , h , i - nb_summit - - - sum_width_summit , h− , , , i , h , , i , h , − , i - - sum_surf_summit , h− , − , , − i , h , , i , h , , i , h− , − , − , i , h− , i , h , i - sum_max_summit - , h− , − , − , i , h− , i , h , i - nb_valley , h , − , − , i , h , , i , h , − , i - - sum_width_valley , h , − , , i , h , , i , h , − , i - - sum_surf_valley , h− , − , , i , h , , i , h , − , i , h , − , i , h− i , h i - sum_min_valley , h , − , − , i , h , , i , h , , i , h , − , i , h− i , h i - nb_zigzag , h− , , − , i , h , , i , h , − , i , h− , , − , , − i , h , , i , h , , i - sum_width_zigzag , h− , , − , i , h , , i , h , − , i , h− , , − , , − i , h , , i , h , , i - sum_surf_zigzag , h− , , − , i , h , , i , h , , i , h− , , − , i , h , , i , h , , i - sum_max_zigzag , h− , − , − , i , h , , i , h , , i , h− , − , − , i , h , , i , h , , i 
, h− , , − , , − , , − i , h , , , i , h , , , i
sum_min_zigzag , h− , , − , i , h , , i , h , , i , h− , , − , i , h , , i , h , , i , h , − , , − , , − , i , h− , − , − , − i , h− , − , − , − i

Evaluating Pattern Properties

This appendix provides the program that computes all the representatives of a pattern and the program that evaluates the properties of a pattern. Both programs (i) convert the regular-expression formulas of this paper into a sequence of operations on finite automata, and (ii) check whether or not the final automaton contains an accepting state.

% Purpose: Compute the set of representative triples of a pattern and the properties of a pattern
% Author: Nicolas Beldiceanu, IMT Atlantique

:- use_module(library(lists)).      % member/2
:- use_module(dfa_aux_appendixC).

% generate all types used to generate the representative triples (see Figure 3) of the reversible
% patterns of Table 1 that have the single letter property
% | ?- top.
% decreasing_terrace-[[out,out,out],[out,out,in],[in,out,out]]
% increasing_terrace-[[out,out,out],[out,out,in],[in,out,out]]
% plain-[[out,out,out],[out,out,in],[in,out,out]]
% plateau-[[out,out,out],[out,out,in],[in,out,out]]
% proper_plain-[[out,out,out],[out,out,in],[in,out,out]]
% proper_plateau-[[out,out,out],[out,out,in],[in,out,out]]
% gorge-[[out,out,suf],[out,out,in],[pre,out,out],[pre,fac,suf],
%        [pre,pre,in],[in,out,out],[in,suf,suf],[in,in,in]]
% summit-[[out,out,suf],[out,out,in],[pre,out,out],[pre,fac,suf],
%         [pre,pre,in],[in,out,out],[in,suf,suf],[in,in,in]]
% peak-[[out,out,out],[out,out,suf],[out,out,in],[pre,out,out],
%       [pre,fac,suf],[pre,pre,in],[in,out,out],[in,suf,suf],[in,in,in]]
% valley-[[out,out,out],[out,out,suf],[out,out,in],[pre,out,out],
%         [pre,fac,suf],[pre,pre,in],[in,out,out],[in,suf,suf],[in,in,in]]
% decreasing_sequence-[[pre,out,suf],[pre,fac,suf],[pre,pre,in],[in,suf,suf],[in,in,in]]
% increasing_sequence-[[pre,out,suf],[pre,fac,suf],[pre,pre,in],[in,suf,suf],[in,in,in]]
% steady_sequence-[[in,in,in]]
% strictly_decreasing_sequence-[[in,in,in]]
% strictly_increasing_sequence-[[in,in,in]]
% zigzag-[[out,out,out],[out,out,in],[in,out,out],[in,out,in],[in,in,in]]
top :-
    member(Pattern, [decreasing_terrace,increasing_terrace,plain,plateau,
                     proper_plain,proper_plateau,gorge,summit,peak,valley,
                     decreasing_sequence,increasing_sequence,steady_sequence,
                     strictly_decreasing_sequence,strictly_increasing_sequence,zigzag]),
    reg_exp(Pattern, LPattern),
    findall(Triple, gen_potential_word_types(LPattern, Triple), Triples),
    write(Pattern-Triples), nl,
    fail.                                   % failure-driven loop over all patterns

% enumerate the triples of word classes whose type language is not empty
gen_potential_word_types(LPattern, Triple) :- % DEFINITION 9
    Triple = [T1, T2, T3],
    PotentialLanguages = [out, fac, pre, suf, in],
    member(T1, PotentialLanguages),
    member(T2, PotentialLanguages),
    member(T3, PotentialLanguages),
    word_language(T1, LPattern, L1),
    word_language(T2, LPattern, L2),
    word_language(T3, LPattern, L3),
    word_type_language(L1, L2, L3, LPattern, LResult),
    regex_kernel(LResult, Automaton),
    (Automaton = kernel([],[]) -> fail ; true).

% language of a word
word_language(out, LPattern, Out) :- % DEFINITION 9
    LEG = {[l],[e],[g]},
    SigmaStar = *(LEG),
    SigmaPlus = (LEG + SigmaStar),
    Out = (SigmaPlus \ (SigmaStar + LPattern + SigmaStar)).
word_language(fac, LPattern, Fac) :- % DEFINITION 9
    LEG = {[l],[e],[g]},
    SigmaStar = *(LEG),
    SigmaPlus = (LEG + SigmaStar),
    Fac = (SigmaPlus + LPattern + SigmaPlus) /\
          (SigmaStar\(LPattern + SigmaPlus)) /\
          (SigmaStar\(SigmaPlus + LPattern)) /\
          (SigmaStar\LPattern).
word_language(pre, LPattern, Pre) :- % DEFINITION 9
    LEG = {[l],[e],[g]},
    SigmaStar = *(LEG),
    SigmaPlus = (LEG + SigmaStar),
    Pre = (LPattern + SigmaPlus) /\
          (SigmaStar\(SigmaPlus + LPattern)) /\
          (SigmaStar\LPattern).
word_language(suf, LPattern, Suf) :- % DEFINITION 9
    LEG = {[l],[e],[g]},
    SigmaStar = *(LEG),
    SigmaPlus = (LEG + SigmaStar),
    Suf = (SigmaPlus + LPattern) /\
          (SigmaStar\(LPattern + SigmaPlus)) /\
          (SigmaStar\LPattern).
word_language(in, LPattern, In) :- % DEFINITION 9
    LEG = {[l],[e],[g]},
    SigmaStar = *(LEG),
    In = (LPattern + SigmaStar) /\
         (SigmaStar + LPattern).

% language of the pattern words, marked by two separators s, that have type L1, L2, L3
word_type_language(L1, L2, L3, LPattern, LResult) :- % DEFINITION 10 and THEOREM 2
    LEG = {[l],[e],[g]},
    SigmaStar = *(LEG),
    SigmaPlus = (LEG + SigmaStar),
    Tempo1 = shuffle(shuffle(LPattern,s),s),
    Tempo2 = (shuffle(L1,s) + [s] + SigmaStar),
    Tempo3 = ((SigmaStar + [s] + L2 + [s] + SigmaPlus) \/
              (SigmaPlus + [s] + L2 + [s] + SigmaStar)),
    Tempo4 = (SigmaStar + [s] + shuffle(L3,s)),
    LResult = (Tempo1 /\ Tempo2 /\ Tempo3 /\ Tempo4).

% check pattern properties shown in Table 1 and in Examples 4 and 9
% | ?- try(convex).
% convex(bump_on_decreasing_sequence)
% convex(decreasing)
% convex(decreasing_sequence)
% convex(decreasing_terrace)
% convex(dip_on_increasing_sequence)
% convex(gorge)
% convex(increasing)
% convex(increasing_sequence)
% convex(increasing_terrace)
% convex(inflexion)
% convex(peak)
% convex(plain)
% convex(plateau)
% convex(proper_plain)
% convex(proper_plateau)
% convex(steady)
% convex(steady_sequence)
% convex(strictly_decreasing_sequence)
% convex(strictly_increasing_sequence)
% convex(summit)
% convex(valley)
% convex(zigzag)
try(convex) :- % DEFINITION 3
    reg_exp(Pattern, LPattern),
    (convex(LPattern) -> write(convex(Pattern)), nl ; true),
    fail.

% | ?- try(no_inflexion).
% no_inflexion(decreasing)
% no_inflexion(decreasing_sequence)
% no_inflexion(decreasing_terrace)
% no_inflexion(increasing)
% no_inflexion(increasing_sequence)
% no_inflexion(increasing_terrace)
% no_inflexion(steady)
% no_inflexion(steady_sequence)
% no_inflexion(strictly_decreasing_sequence)
% no_inflexion(strictly_increasing_sequence)
try(no_inflexion) :- % DEFINITION 4
    reg_exp(Pattern, LPattern),
    (no_inflexion(LPattern) -> write(no_inflexion(Pattern)), nl ; true),
    fail.

% | ?- try(one_inflexion).
% one_inflexion(gorge)
% one_inflexion(inflexion)
% one_inflexion(peak)
% one_inflexion(plain)
% one_inflexion(plateau)
% one_inflexion(proper_plain)
% one_inflexion(proper_plateau)
% one_inflexion(summit)
% one_inflexion(valley)
try(one_inflexion) :- % DEFINITION 5
    reg_exp(Pattern, LPattern),
    (one_inflexion(LPattern) -> write(one_inflexion(Pattern)), nl ; true),
    fail.

% | ?- try(single_letter).
% single_letter(decreasing)
% single_letter(increasing)
% single_letter(steady)
try(single_letter) :- % DEFINITION 6
    reg_exp(Pattern, LPattern),
    (single_letter(LPattern) -> write(single_letter(Pattern)), nl ; true),
    fail.

% | ?- try(exclude_out_in).
% exclude_out_in(decreasing)
% exclude_out_in(decreasing_sequence)
% exclude_out_in(increasing)
% exclude_out_in(increasing_sequence)
% exclude_out_in(steady)
% exclude_out_in(steady_sequence)
% exclude_out_in(strictly_decreasing_sequence)
% exclude_out_in(strictly_increasing_sequence)
try(exclude_out_in) :- % DEFINITION 7
    reg_exp(Pattern, LPattern),
    (exclude_out_in(LPattern) -> write(exclude_out_in(Pattern)), nl ; true),
    fail.

% | ?- try(letter).
% letter(decreasing,g)
% letter(decreasing_sequence,g)
% letter(increasing,l)
% letter(increasing_sequence,l)
% letter(steady,e)
% letter(steady_sequence,e)
% letter(strictly_decreasing_sequence,g)
% letter(strictly_increasing_sequence,l)
try(letter) :- % DEFINITION 20
    reg_exp(Pattern, LPattern),
    member(Letter, [l,e,g]),
    (letter(LPattern, Letter) -> write(letter(Pattern,Letter)), nl ; true),
    fail.

% | ?- try(suffix_unavoidable).
% suffix_unavoidable(decreasing,g)
% suffix_unavoidable(decreasing_sequence,g)
% suffix_unavoidable(gorge,g)
% suffix_unavoidable(increasing,l)
% suffix_unavoidable(increasing_sequence,l)
% suffix_unavoidable(peak,l)
% suffix_unavoidable(plain,g)
% suffix_unavoidable(plateau,l)
% suffix_unavoidable(proper_plain,g)
% suffix_unavoidable(proper_plateau,l)
% suffix_unavoidable(steady,e)
% suffix_unavoidable(steady_sequence,e)
% suffix_unavoidable(strictly_decreasing_sequence,g)
% suffix_unavoidable(strictly_increasing_sequence,l)
% suffix_unavoidable(summit,l)
% suffix_unavoidable(valley,g)
try(suffix_unavoidable) :- % DEFINITION 21
    reg_exp(Pattern, LPattern),
    member(Letter, [l,e,g]),
    (suffix_unavoidable(LPattern, Letter) -> write(suffix_unavoidable(Pattern,Letter)), nl ; true),
    fail.

% | ?- try(incompressible).
% incompressible(bump_on_decreasing_sequence)
% incompressible(decreasing)
% incompressible(decreasing_terrace)
% incompressible(dip_on_increasing_sequence)
% incompressible(increasing)
% incompressible(increasing_terrace)
% incompressible(plain)
% incompressible(plateau)
% incompressible(proper_plain)
% incompressible(proper_plateau)
% incompressible(steady)
try(incompressible) :- % DEFINITION 22
    reg_exp(Pattern, LPattern),
    (incompressible(LPattern) -> write(incompressible(Pattern)), nl ; true),
    fail.

% | ?- try(factor).
% factor(bump_on_decreasing_sequence,5)
% factor(decreasing,1)
% factor(dip_on_increasing_sequence,5)
% factor(increasing,1)
% factor(steady,1)
% factor(steady_sequence,1)
% factor(strictly_decreasing_sequence,1)
% factor(strictly_increasing_sequence,1)
% factor(zigzag,3)
try(factor) :- % DEFINITION 23
    reg_exp(Pattern, LPattern),
    pattern_smallest_size(Pattern, Minl),
    (factor(LPattern, Minl) -> write(factor(Pattern,Minl)), nl ; true),
    fail.

% each property below holds iff the automaton of its counterexample language is empty
convex(LPattern) :- % DEFINITION 3
    LEG = {[l],[e],[g]},
    SigmaStar = *(LEG),
    SigmaPlus = (LEG + SigmaStar),
    L1 = (shuffle(shuffle(LPattern,s),s) /\
          (SigmaStar + [s] + LPattern + SigmaStar + LPattern + [s] + SigmaStar) /\
          (SigmaStar + [s] + (SigmaPlus\LPattern) + [s] + SigmaStar)),
    L2 = (shuffle(shuffle(shuffle(shuffle(LPattern,s),s),s),s) /\
          (SigmaStar + [s] + shuffle(LPattern,s) + [s] + SigmaPlus + [s] + SigmaStar) /\
          (SigmaStar + [s] + SigmaPlus + [s] + shuffle(LPattern,s) + [s] + SigmaStar) /\
          (SigmaStar + [s] + shuffle(shuffle(SigmaPlus\LPattern,s),s) + [s] + SigmaStar)),
    regex_kernel(L1, Automaton1),
    regex_kernel(L2, Automaton2),
    Automaton1 = kernel([],[]),
    Automaton2 = kernel([],[]).

no_inflexion(LPattern) :- % DEFINITION 4
    LEG = {[l],[e],[g]},
    SigmaStar = *(LEG),
    L = (LPattern /\ (SigmaStar + [l] + SigmaStar + [g] + SigmaStar)) \/
        (LPattern /\ (SigmaStar + [g] + SigmaStar + [l] + SigmaStar)),
    regex_kernel(L, Automaton),
    Automaton = kernel([],[]).

one_inflexion(LPattern) :- % DEFINITION 5
    Inf1 = (*([l] \/ [e]) + [l] + *([e]) + [g] + *([g] \/ [e])),
    Inf2 = (*([g] \/ [e]) + [g] + *([e]) + [l] + *([l] \/ [e])),
    L = (LPattern\(Inf1 \/ Inf2)),
    regex_kernel(L, Automaton),
    Automaton = kernel([],[]).

single_letter(LPattern) :- % DEFINITION 6
    LEG = {[l],[e],[g]},
    L = LPattern\LEG,
    regex_kernel(L, Automaton),
    Automaton = kernel([],[]).

exclude_out_in(LPattern) :- % DEFINITION 7
    LEG = {[l],[e],[g]},
    SigmaStar = *(LEG),
    SigmaPlus = (LEG + SigmaStar),
    L1 = (shuffle(shuffle(shuffle(shuffle(LPattern,s),s),s),s) /\
          (SigmaStar + [s] + SigmaPlus + [s] + shuffle(SigmaPlus\(SigmaStar+LPattern+SigmaStar),s) + [s] + SigmaStar) /\
          (SigmaStar + [s] + shuffle(LPattern,s) + [s] + SigmaStar + [s] + SigmaStar) /\
          (SigmaStar + [s] + SigmaPlus + [s] + SigmaPlus + [s] + SigmaStar + [s] + SigmaStar)),
    L2 = (shuffle(shuffle(shuffle(shuffle(LPattern,s),s),s),s) /\
          (SigmaStar + [s] + shuffle(SigmaPlus\(SigmaStar+LPattern+SigmaStar),s) + [s] + SigmaPlus + [s] + SigmaStar) /\
          (SigmaStar + [s] + SigmaStar + [s] + shuffle(LPattern,s) + [s] + SigmaStar) /\
          (SigmaStar + [s] + SigmaStar + [s] + SigmaPlus + [s] + SigmaPlus + [s] + SigmaStar)),
    regex_kernel(L1, Automaton1),
    regex_kernel(L2, Automaton2),
    Automaton1 = kernel([],[]),
    Automaton2 = kernel([],[]).

letter(LPattern, Letter) :- % DEFINITION 20
    LEG = {[l],[e],[g]},
    L1 = (*(LEG\[Letter])) /\ LPattern,
    L2 = [Letter] /\ LPattern,
    regex_kernel(L1, Automaton1),
    regex_kernel(L2, Automaton2),
    Automaton1 = kernel([],[]),
    (Automaton2 = kernel([],[]) -> fail ; true).

suffix_unavoidable(LPattern, Letter) :- % DEFINITION 21
    LEG = {[l],[e],[g]},
    SigmaStar = *(LEG),
    L1 = (*(LEG\[Letter])) /\ LPattern,
    L2 = (shuffle(LPattern,s) /\
          (SigmaStar + [s] + [Letter] + SigmaStar) /\
          (SigmaStar + [s] + (SigmaStar\LPattern))),
    regex_kernel(L1, Automaton1),
    regex_kernel(L2, Automaton2),
    Automaton1 = kernel([],[]),
    Automaton2 = kernel([],[]).

incompressible(LPattern) :- % DEFINITION 22
    LEG = {[l],[e],[g]},
    SigmaStar = *(LEG),
    SigmaPlus = (LEG + SigmaStar),
    L = ((SigmaPlus + LPattern + SigmaStar) /\ LPattern) \/
        ((SigmaStar + LPattern + SigmaPlus) /\ LPattern),
    regex_kernel(L, Automaton),
    Automaton = kernel([],[]).

factor(LPattern, Minl) :- % DEFINITION 23
    LEG = {[l],[e],[g]},
    SigmaStar = *(LEG),
    (Minl = 1 -> SigmaMinl = LEG ;
     Minl = 2 -> SigmaMinl = LEG+LEG ;
     Minl = 3 -> SigmaMinl = LEG+LEG+LEG ;
     Minl = 4 -> SigmaMinl = LEG+LEG+LEG+LEG ;
     Minl = 5 -> SigmaMinl = LEG+LEG+LEG+LEG+LEG ;
     write(minl_no_implemented(Minl)), nl, false),
    L = (shuffle(shuffle(LPattern,s),s) /\
         (SigmaStar + [s] + SigmaStar + SigmaMinl + SigmaStar + [s] + SigmaStar) /\
         (SigmaStar + [s] + (SigmaStar\LPattern) + [s] + SigmaStar)),
    regex_kernel(L, Automaton),
    Automaton = kernel([],[]).

% reg_exp(pattern, reg_exp): l for <, e for =, g for >
reg_exp(bump_on_decreasing_sequence, [l,l,g,l,l]). % <<><