Faster Binary Mean Computation Under Dynamic Time Warping
Nathan Schaar
Technische Universität Berlin, Faculty IV, Algorithmics and Computational [email protected]
Vincent Froese
Technische Universität Berlin, Faculty IV, Algorithmics and Computational [email protected]
Rolf Niedermeier
Technische Universität Berlin, Faculty IV, Algorithmics and Computational [email protected]
Abstract
Many consensus string problems are based on Hamming distance. We replace Hamming distance by the more flexible (e.g., easily coping with different input string lengths) dynamic time warping distance, best known from applications in time series mining. Doing so, we study the problem of finding a mean string that minimizes the sum of (squared) dynamic time warping distances to a given set of input strings. While this problem is known to be NP-hard (even for strings over a three-element alphabet), we address the binary alphabet case which is known to be polynomial-time solvable. We significantly improve on a previously known algorithm in terms of worst-case running time. Moreover, we also show the practical usefulness of one of our algorithms in experiments with real-world and synthetic data. Finally, we identify special cases solvable in linear time (e.g., finding a mean of only two binary input strings) and report some empirical findings concerning combinatorial properties of optimal means.
Theory of computation → Dynamic programming; Theory of computation → Pattern matching
Keywords and phrases: consensus string problems, time series averaging, minimum 1-separated sum, sparse strings
Funding
Nathan Schaar: Partially supported by DFG NI 369/19.
Consensus problems are an integral part of stringology. For instance, in the frequently studied
Closest String problem one is given k strings of equal length and the task is to find a center string that minimizes the maximum Hamming distance to all k input strings. Closest String is NP-hard even for binary alphabet [11] and has been extensively studied in the context of approximation and parameterized algorithmics [5, 7, 8, 9, 13, 15, 17, 20]. Notably, when one wants to minimize the sum of distances instead of the maximum distance, the problem is easily solvable in linear time by taking at each position a letter that appears most frequently in the input strings.

Hamming distance, however, is quite limited in many application contexts; for instance, how to define a center string in case of input strings that do not all have the same length? In the context of analyzing time series (basically strings where the alphabet consists of rational numbers), the "more flexible" dynamic time warping distance [18] enjoys high popularity and can be computed for two input strings in subquadratic time [12, 14], essentially matching corresponding conditional lower bounds [1, 3]. Roughly speaking (see Section 2 for formal definitions and an example), measuring the dynamic time warping distance (dtw for short) can be seen as a two-step process: First, one aligns one time series with the other (by stretching them via duplication of elements) such that both time series end up with the same length. Second, one then calculates the Euclidean distance of the aligned time series (recall that here the alphabet consists of numbers). Importantly, restricting to the binary case, the dtw distance of two time series can be computed in O(n^1.87) time [1], where n is the maximum time series length (a result that will also be relevant for our work).

With the dtw distance at hand, the most fundamental consensus problem in this (time series) context is, given k input "strings" (over rational numbers), compute a mean string that minimizes the sum of (squared) dtw distances to all input strings.
This problem is known as DTW-Mean in the literature and only recently has been shown to be NP-hard [4, 6]. For the most basic case, namely binary alphabet (that is, input and output are binary), however, the problem is known to be solvable in O(kn^2) time [2]. By way of contrast, if one allows the mean to contain any rational numbers, then the problem is NP-hard even for binary inputs [6]. Moreover, the problem is also NP-hard for ternary input and output [4]. Formally, in this work we study the following problem:

Binary DTW-Mean (BDTW-Mean)
Input:
Binary strings s_1, . . . , s_k of length at most n and c ∈ Q.
Question:
Is there a binary string z such that F(z) := Σ_{i=1}^k dtw(s_i, z)^2 ≤ c?

Herein, the dtw function is formally defined in Section 2. The study of the special case of binary data may occur when one deals with binary states (e.g., switching between the active and the inactive mode of a sensor); binary data were recently studied in the dynamic time warping context [16, 19]. Clearly, binary data can always be generated from more general data by "thresholding".

Our main theoretical result is to show that BDTW-Mean can be solved in O(kn^1.87) and O(k(n + m(m − µ))) time, respectively, where m is the maximum and µ is the median condensation length of the input strings (the condensation of a string is obtained by repeatedly removing one of two identical consecutive elements). While the first algorithm relies on an intricate "blackbox algorithm" for a certain number problem from the literature (which so far was never implemented), the second algorithm (which we implemented) is more directly based on combinatorial arguments. Anyway, our new bounds improve on the standard O(kn^2)-time bound [2]. Moreover, we also experimentally tested our second algorithm and compared it to the standard one, clearly outperforming it (typically by orders of magnitude) on real-world and on synthetic instances. Further theoretical results comprise linear-time algorithms for special cases (two input strings or three input strings with some additional constraints). Further empirical results relate to the typical shape of a mean.

For n ∈ N, let [n] := {1, . . . , n}. We consider binary strings x = x[1]x[2] . . . x[n] ∈ {0, 1}^n. We denote the length of x by |x| and we also denote the last symbol x[n] of x by x[−1]. For 1 ≤ i ≤ j ≤ |x|, we define the substring x[i, j] := x[i] . . . x[j]. A maximal substring of consecutive 1's (0's) in x is called a 1-block (0-block). The i-th block of x (from left to right) is denoted x^(i).
A string x is called condensed if no two consecutive elements are equal, that is, every block is of size 1. The condensation of x is denoted x̃ and is defined as the condensed string obtained by removing one of two equal consecutive elements of x until the remaining string is condensed. Note that the condensation length |x̃| equals the number of blocks in x.

The dynamic time warping distance measures the similarity of two strings using non-linear alignments defined via so-called warping paths.

Definition 1. A warping path of order m × n is a sequence p = (p_1, . . . , p_L), L ∈ N, of index pairs p_ℓ = (i_ℓ, j_ℓ) ∈ [m] × [n], 1 ≤ ℓ ≤ L, such that
(i) p_1 = (1, 1),
(ii) p_L = (m, n), and
(iii) (i_{ℓ+1} − i_ℓ, j_{ℓ+1} − j_ℓ) ∈ {(1, 0), (0, 1), (1, 1)} for each ℓ ∈ [L − 1].

A warping path can be visualized within an m × n "warping matrix" (see Figure 1). The set of all warping paths of order m × n is denoted by P_{m,n}. A warping path p ∈ P_{m,n} defines an alignment between two strings x ∈ Q^m and y ∈ Q^n in the following way: A pair (i, j) ∈ p aligns element x_i with y_j with a local cost of (x_i − y_j)^2. The dtw distance between two strings x and y is defined as

dtw(x, y) := min_{p ∈ P_{m,n}} √( Σ_{(i,j) ∈ p} (x_i − y_j)^2 ).

It is computable via standard dynamic programming in O(mn) time [18], with recent theoretical improvements to subquadratic time [12, 14].

We briefly discuss some known results about the dtw distance between binary strings since these will be crucial for our algorithms for BDTW-Mean.
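The definition above translates directly into the standard O(mn) dynamic program. The following is an illustrative sketch (our own code, not the authors' implementation):

```python
import math

def dtw(x, y):
    """Dynamic time warping distance between two number sequences,
    computed by the standard O(m*n) dynamic program over warping paths."""
    m, n = len(x), len(y)
    # D[i][j] = minimum sum of squared local costs over warping paths
    #           aligning the prefixes x[0:i] and y[0:j]
    D = [[math.inf] * (n + 1) for _ in range(m + 1)]
    D[0][0] = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = (x[i - 1] - y[j - 1]) ** 2
            D[i][j] = cost + min(D[i - 1][j - 1],  # diagonal step (1, 1)
                                 D[i - 1][j],      # step (1, 0)
                                 D[i][j - 1])      # step (0, 1)
    return math.sqrt(D[m][n])
```

For instance, for the binary strings of Figure 1, x = 00101100101 and y = 0001100111, the squared distance dtw(x, y)^2 evaluates to 2.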
Abboud et al. [1, Section 5] showed that the dtw distance of two binary strings of length at most n can be computed in O(n^1.87) time. They obtained this result by reducing the dtw distance computation to the following integer problem.

Min 1-Separated Sum (MSS)
Input:
A sequence (b_1, . . . , b_m) of m positive integers and an integer r ≥ 0.
Task:
Select r integers b_{i_1}, . . . , b_{i_r} with 1 ≤ i_1 < i_2 < · · · < i_r ≤ m and i_j < i_{j+1} − 1 for 1 ≤ j < r such that Σ_{j=1}^r b_{i_j} is minimized.

Throughout this work, we assume that all arithmetic operations can be carried out in constant time.
Figure 1
An optimal warping path for the strings x = 00101100101 (vertical axis) and y = 0001100111 (horizontal axis). Black cells have local cost 1. The string x consists of eight blocks with sizes 2, 1, 1, 2, 2, 1, 1, 1 and y consists of four blocks with sizes 3, 2, 2, 3. An optimal warping path has to pass through (8 − 4)/2 = 2 misaligned blocks of x. The integers of the
MSS instance correspond to the block sizes of the input string which contains more blocks.
Theorem 2 ([1, Theorem 8]). Let x ∈ {0, 1}^m and y ∈ {0, 1}^n be two binary strings such that x[1] = y[1], x[m] = y[n], and |x̃| ≥ |ỹ|. Then, dtw(x, y)^2 equals the sum of a solution for the MSS instance ((|x^(2)|, . . . , |x^(|x̃|−1)|), (|x̃| − |ỹ|)/2).

The idea behind Theorem 2 is that exactly (|x̃| − |ỹ|)/2 blocks of x are misaligned in any warping path (note that |x̃| − |ỹ| is even since x and y start and end with the same symbol). An optimal warping path can thus be obtained from minimizing the sum of block sizes of these misaligned blocks. For example, in Figure 1 the dtw distance corresponds to a solution of the MSS instance ((1, 1, 2, 2, 1, 1), 2).

Abboud et al. [1, Theorem 10] showed how to solve MSS in O(n^1.87) time, where n = Σ_{i=1}^m b_i. They gave a recursive algorithm that, on input ((b_1, . . . , b_m), r), outputs four lists C, C^*, C_*, and C_*^*, where, for t ∈ {0, . . . , r},
C_*^*[t] is the sum of a solution for the MSS instance ((b_1, . . . , b_m), t),
C_*[t] is the sum of a solution for the MSS instance ((b_2, . . . , b_m), t),
C^*[t] is the sum of a solution for the MSS instance ((b_1, . . . , b_{m−1}), t), and
C[t] is the sum of a solution for the MSS instance ((b_2, . . . , b_{m−1}), t).

Note that C_*^*[r] yields the solution. We will make use of their algorithm when solving BDTW-Mean. We will also use the following simple dynamic programming algorithm for
MSS, which is faster for large input integers.
Lemma 3.
Min 1-Separated Sum is solvable in O(mr) time.

Proof.
Let ((b_1, . . . , b_m), r) be an MSS instance. We define a dynamic programming table M as follows: For each i ∈ [m] and each j ∈ {0, . . . , min(r, ⌈i/2⌉)}, M[i, j] is the sum of a solution of the subinstance ((b_1, . . . , b_i), j). Clearly, it holds M[i, 0] = 0 and M[i, 1] = min{b_1, . . . , b_i} for all i. Further, it holds M[3, 2] = b_1 + b_3. For all larger i and j ∈ {2, . . . , min(r, ⌈i/2⌉)}, the following recursion holds:

M[i, j] = min(b_i + M[i − 2, j − 1], M[i − 1, j]).

Hence, the table M can be computed in O(mr) time. ◀

Note that the above algorithms only compute the dtw distance between binary strings with equal starting and ending symbol. However, it is an easy observation that the dtw distance of arbitrary binary strings can recursively be obtained from this via case distinction on which first and/or which last block to misalign.
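The recursion from the proof of Lemma 3 translates directly into a small table-filling routine (an illustrative sketch; the handling of the base cases is our own choice):

```python
import math

def mss(b, r):
    """Min 1-Separated Sum: minimum sum of r elements of b such that no two
    chosen indices are consecutive (dynamic program of Lemma 3, O(m*r) time).
    Returns math.inf if no such selection exists."""
    m = len(b)
    # M[i][j] = optimal value for the subinstance ((b[0], ..., b[i-1]), j)
    M = [[math.inf] * (r + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        M[i][0] = 0
    for i in range(1, m + 1):
        for j in range(1, r + 1):
            skip = M[i - 1][j]  # do not select b[i-1]
            prev = M[i - 2][j - 1] if i >= 2 else (0 if j == 1 else math.inf)
            M[i][j] = min(skip, b[i - 1] + prev)  # or select b[i-1]
    return M[m][r]
```

For the MSS instance ((1, 1, 2, 2, 1, 1), 2) arising from Figure 1, this returns 2 (the two outer size-1 blocks).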
Observation 4 ([1, Claim 6]). Let x, y ∈ {0, 1}^* with m := |x̃| ≥ n := |ỹ|. Further, let a_1 := |x^(1)|, a_2 := |x^(m)|, b_1 := |y^(1)|, and b_2 := |y^(n)|. The following holds:

If x[1] ≠ y[1], then

dtw(x, y)^2 =
  max(a_1, b_1), if m = n = 1,
  a_1 + dtw(x[a_1 + 1, |x|], y)^2, if m > n = 1,
  min(a_1 + dtw(x[a_1 + 1, |x|], y)^2, b_1 + dtw(x, y[b_1 + 1, |y|])^2), if n > 1.

If x[1] = y[1] and x[−1] ≠ y[−1], then

dtw(x, y)^2 =
  a_2 + dtw(x[1, |x| − a_2], y)^2, if n = 1,
  min(a_2 + dtw(x[1, |x| − a_2], y)^2, b_2 + dtw(x, y[1, |y| − b_2])^2), if n > 1.

For condensed strings, Brill et al. [2] derived the following useful closed form for the dtw distance (which basically follows from Observation 4 and Theorem 2).
Lemma 5 ([2, Lemmas 1 and 2]). For a condensed binary string x and a binary string y with |ỹ| ≤ |x|, it holds that

dtw(x, y)^2 =
  ⌊(|x| − |ỹ|)/2⌋ + 1, if x[1] ≠ y[1] ∧ |x| > |ỹ|,
  0, if x[1] = y[1] ∧ |x| = |ỹ|,
  ⌈(|x| − |ỹ|)/2⌉, if x[1] = y[1] ∧ |x| > |ỹ|.

According to Lemma 5, one can compute the dtw distance in constant time when the condensation lengths of the inputs are known and the string with the longer condensation length is condensed.

Our key lemma now states that the dtw distances between an arbitrary fixed string and all condensed strings of shorter condensation length can also be computed efficiently.
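The quantities appearing in Lemma 5 (condensations and block sizes) are computable with a single linear scan. A minimal sketch (function names are our own):

```python
def condense(s):
    """Condensation of a binary string: collapse every block (maximal run of
    equal symbols) to a single symbol."""
    return "".join(c for i, c in enumerate(s) if i == 0 or c != s[i - 1])

def block_sizes(s):
    """Sizes of the blocks of s, from left to right."""
    sizes = []
    for i, c in enumerate(s):
        if i == 0 or c != s[i - 1]:
            sizes.append(1)   # a new block starts here
        else:
            sizes[-1] += 1    # the current block grows
    return sizes
```

For the strings of Figure 1: condense("00101100101") = "01010101" (eight blocks of sizes [2, 1, 1, 2, 2, 1, 1, 1]) and condense("0001100111") = "0101" (four blocks of sizes [3, 2, 2, 3]).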
Lemma 6.
Let s ∈ {0, 1}^n with ℓ := |s̃|. Given ℓ and the block sizes b_1, . . . , b_ℓ of s, the dtw distances between s and all condensed strings of lengths ℓ′, . . . , ℓ for some given ℓ′ ≤ ℓ can be computed in
(i) O(n^1.87) time and in
(ii) O(ℓ(ℓ − ℓ′)) time, respectively.

Proof.
Let x be a condensed string of length i ∈ {ℓ′, . . . , ℓ}. Observation 4 and Theorem 2 imply that we essentially have to solve MSS on four different subsequences of block sizes of s (depending on the first and last symbol of x) in order to compute dtw(s, x). Namely, the four cases are (b_2, . . . , b_{ℓ−1}), (b_1, . . . , b_{ℓ−1}), (b_2, . . . , b_ℓ), and (b_1, . . . , b_ℓ). Let r := ⌈(ℓ − ℓ′)/2⌉.

(i) We run the algorithm of Abboud et al. [1, Theorem 10] on the instance ((b_1, . . . , b_ℓ), r) to obtain in O(n^1.87) time the four lists C_αβ, for α, β ∈ {0, 1}, where C_αβ contains the solutions of ((b_{1+α}, . . . , b_{ℓ−β}), r′) for all r′ ∈ {0, . . . , r}. From these four lists, we can compute the requested dtw distances (using Observation 4) in O(ℓ) time.

(ii) We compute the solutions of the four above MSS instances using Lemma 3. For each α, β ∈ {0, 1}, let M_αβ be the dynamic programming table computed in O(ℓ(ℓ − ℓ′)) time for the instance ((b_{1+α}, . . . , b_{ℓ−β}), r). Again, we can compute the requested dtw distances from these four tables in O(ℓ) time (using Observation 4). ◀
Brill et al. [2] gave an O(kn^3)-time algorithm for BDTW-Mean. The result is based on showing that there always exists a condensed mean of length at most n + 1. Thus, there are 2(n + 1) candidate strings to check. For each candidate, one can compute the dtw distance to every input string in O(kn^2) time. It is actually enough to only compute the dtw distances for the two length-(n + 1) candidates to all k input strings since the resulting dynamic programming tables also yield all the distances to shorter candidates. That is, the running time can actually be bounded by O(kn^2).

We now give an improved algorithm. To this end, we first show the following improved bounds on the (condensation) length of a mean.

Lemma 7.
Let s_1, . . . , s_k be binary strings with |s̃_1| ≤ · · · ≤ |s̃_k| and let z be a mean. Then, it holds that µ − 2 ≤ |z̃| ≤ m + 1, where µ = |s̃_⌈k/2⌉| is the median condensation length and m = |s̃_k| is the maximum condensation length.

Proof.
It suffices to show the claimed bounds for condensed means. Since dtw(x̃, y) ≤ dtw(x, y) holds for all strings x, y [2, Proposition 1], the bounds also hold for arbitrary means.

The upper bound m + 1 can be derived from Lemma 5. Let x be a condensed string of length |x| ≥ m + 2 and let x′ := x[1, |x| − 1]. If |x| > m + 2, then dtw(x′, s_i) < dtw(x, s_i) holds for every i ∈ [k], which implies F(x′) = Σ_{i=1}^k dtw(s_i, x′)^2 < Σ_{i=1}^k dtw(s_i, x)^2 = F(x). Hence, x is not a mean. If |x| = m + 2, then dtw(x′, s_i) ≤ dtw(x, s_i) holds for every i ∈ [k], that is, F(x′) ≤ F(x). If F(x′) < F(x), then x is clearly not a mean. If F(x′) = F(x), then dtw(x′, s_i) = dtw(x, s_i) holds for all i ∈ [k]. In fact, dtw(x′, s_i) = dtw(x, s_i) only holds if |s̃_i| = m and s_i[1] ≠ x[1], in which case dtw(x, s_i)^2 = 2. Thus, we have F(x) = 2k and s̃_1 = s̃_2 = · · · = s̃_k. But then s̃_1 is clearly the unique mean (with F(s̃_1) = 0), and x is again not a mean.

For the lower bound, let x be a condensed string of length ℓ < µ − 2 and let x′ := x[1] . . . x[ℓ]x[ℓ − 1]x[ℓ], that is, x extended by two symbols. Then, for every s_i with |s̃_i| ≤ ℓ (of which there are less than ⌈k/2⌉ since ℓ < µ), it holds dtw(x′, s_i)^2 ≤ dtw(x, s_i)^2 + 1 (by Lemma 5). Now, for every s_i with |s̃_i| > ℓ + 2 (of which there are at least ⌈k/2⌉ since ℓ + 2 < µ), it holds dtw(x′, s_i)^2 ≤ dtw(x, s_i)^2 − 1. This is easy to see from Theorem 2 for the case that s_i[1] = x[1] and s_i[−1] = x[−1] holds since the number of misaligned blocks of s_i decreases by at least one. From this, Observation 4 yields the other three possible cases of starting and ending symbols since the sizes of the first and last block of x and of x′ are clearly all the same (one).

It remains to consider input strings s_i with ℓ < |s̃_i| ≤ ℓ + 2. We show that in this case dtw(x′, s_i) ≤ dtw(x, s_i) holds. Let |s̃_i| = ℓ + 2. Note that then either x′[1] = s_i[1] and x′[−1] = s_i[−1] holds or x′[1] ≠ s_i[1] and x′[−1] ≠ s_i[−1] holds. In the former case, it clearly holds dtw(x′, s_i) = 0 by Lemma 5. In the latter case, we clearly have dtw(x, s_i)^2 ≥ 2 = dtw(x′, s_i)^2. Finally, let |s̃_i| = ℓ + 1 and note that then either x′[1] = s_i[1] and x′[−1] ≠ s_i[−1] holds or x′[1] ≠ s_i[1] and x′[−1] = s_i[−1] holds. Thus, we clearly have dtw(x, s_i)^2 ≥ 1. By Lemma 5, we have dtw(x′, s_i)^2 = 1.

Summing up, we obtain F(x′) ≤ F(x) + a − b, where a = |{i ∈ [k] : |s̃_i| ≤ ℓ}| < ⌈k/2⌉ and b = |{i ∈ [k] : |s̃_i| > ℓ + 2}| ≥ ⌈k/2⌉. That is, F(x′) < F(x) and x is not a mean. ◀

Note that the length bounds in Lemma 7 are tight. For the upper bound, consider the two strings 000 and 111 having the two means 01 and 10. For the lower bound, consider the seven strings 0, 0, 0, 101, 101, 010, 010 with the unique mean 0.
Lemma 7 upper-bounds the number of mean candidates we have to consider in terms of the condensation lengths of the inputs. In order to compute the dtw distances between mean candidates and input strings, we can now use Lemma 6. We arrive at the following result.
Theorem 8.
Let s_1, . . . , s_k be binary strings with |s̃_1| ≤ · · · ≤ |s̃_k| and let n = max_{j=1,...,k} |s_j|, µ = |s̃_⌈k/2⌉|, and m = |s̃_k|. The condensed means of s_1, . . . , s_k can be computed in
(i) O(kn^1.87) time and in
(ii) O(k(n + m(m − µ))) time.

Proof.
From Lemma 7, we know that there are O(m − µ) many candidate strings to check. First, in linear time, we determine the block lengths for each s_j. Now, let x be a candidate string, that is, x is a condensed binary string with µ − 2 ≤ |x| ≤ m + 1. We need to compute dtw(x, s_j) for each j = 1, . . . , k. Consider a fixed string s_j. For all candidates x with |x| ≥ |s̃_j|, we can simply compute dtw(x, s_j) in constant time using Lemma 5. For all x with |x| < |s̃_j|, we can use Lemma 6. Thus, overall, we can compute the dtw distances between all candidates and all input strings in O(kn^1.87) time, or alternatively in O(km(m − µ)) time. Finally, we determine the candidates with the minimum sum of dtw distances in O(k(m − µ)) time. ◀

We remark that similar results also hold for the related problems
Weighted Binary DTW-Mean, where the objective is to minimize F(z) := Σ_{i=1}^k w_i dtw(s_i, z)^2 for some w_i ≥
0, and
Binary DTW-Center with F(z) := max_{i=1,...,k} dtw(s_i, z) (that is, the dtw version of Closest String). It is easy to see that also in these cases there exists a condensed solution. Moreover, the length is clearly bounded between the minimum and the maximum condensation length of the inputs. Hence, analogously to Theorem 8, we obtain the following.
Corollary 9.
Weighted Binary DTW-Mean and
Binary DTW-Center can be solved in O(kn^1.87) time and in O(k(n + m(m − ν))) time, where m is the maximum condensation length and ν is the minimum condensation length.

Notably, Theorem 8 (ii) yields a linear-time algorithm when m − µ is constant and also when all input strings have the same length n and m(m − µ) ∈ O(n). Now, we show two more linear-time solvable cases.

Theorem 10.
A condensed mean of two binary strings can be computed in linear time.
Proof.
Let s_1, s_2 ∈ {0, 1}^* be two input strings. We first determine the condensations and block sizes of s_1 and s_2 in linear time. Let ℓ_i := |s̃_i|, for i ∈ [2], and assume that ℓ_1 ≤ ℓ_2. In the following, all claimed relations between dtw distances can easily be verified using Observation 4 (together with Theorem 2) and Lemma 5.

If ℓ_1 = ℓ_2, then, by Theorem 8 (with µ = m = ℓ_2), all condensed means can be computed in O(ℓ_2) time.

If ℓ_1 < ℓ_2, then s̃_2 is a mean. To see this, note first that F(s̃_2) = dtw(s_1, s̃_2)^2. Let x be a condensed string. If |x| < ℓ_1, then dtw(s_1, x)^2 > 0 and dtw(s_2, x)^2 ≥ dtw(s̃_2, x)^2 ≥ dtw(s̃_2, s̃_1)^2 = dtw(s̃_2, s_1)^2. Thus, F(x) > F(s̃_2). Similarly, if |x| > ℓ_2, then dtw(s_1, x)^2 ≥ dtw(s_1, s̃_2)^2 and dtw(s_2, x)^2 > 0, and F(x) > F(s̃_2). If ℓ_1 ≤ |x| < ℓ_2, then dtw(s_1, x)^2 + dtw(s_2, x)^2 ≥ dtw(s_1, s̃_2)^2, and thus F(x) ≥ F(s̃_2). ◀

For three input strings, we show linear-time solvability if all strings begin with the same symbol and end with the same symbol.
Theorem 11.
Let s_1, s_2, s_3 be binary strings with s_1[1] = s_2[1] = s_3[1] and s_1[−1] = s_2[−1] = s_3[−1]. A condensed mean of s_1, s_2, s_3 can be computed in linear time.

Proof.
We first determine the condensations and block sizes of s_1, s_2, and s_3 in linear time. Let ℓ_i := |s̃_i|, for i ∈ [3], and assume ℓ_1 ≤ ℓ_2 ≤ ℓ_3. Note that every mean starts with s_1[1] and ends with s_1[−1] since for every string x with x[1] ≠ s_1[1] (or x[−1] ≠ s_1[−1]), appending s_1[1] to the front (or s_1[−1] to the end) yields a better F-value. Moreover, it is easy to see that every condensed mean has length at least ℓ_2 since increasing the length of any shorter condensed string by two increases the dtw distance to s_1 by at most one (Lemma 5) and decreases the dtw distances to s_2 and s_3 by at least one (Theorem 2).

Note that a mean could be even longer than ℓ_2 since further increasing the length by two increases the dtw distance to s_1 and s_2 by at most one and could possibly decrease the dtw distance to s_3 by at least two (if a misaligned block of size at least two can be saved). In fact, we can determine an optimal mean length in O(ℓ_3) time by greedily computing the maximum number ρ of 1-separated (that is, non-neighboring) blocks of size one among s_3^(2), . . . , s_3^(ℓ_3−1). Then there is a mean of length ℓ_3 − 2ρ (that is, exactly ρ size-1 blocks of s_3 are misaligned). Clearly, any longer condensed string has a larger F-value and every shorter condensed string has at least the same F-value. ◀

We strongly conjecture that similar but more technical arguments can be used to obtain a linear-time algorithm for three arbitrary input strings. For more than three strings, however, it is not clear how to achieve linear time, since the mean length cannot be greedily determined.
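The greedy computation of ρ in the proof of Theorem 11 can be sketched as a single left-to-right scan over the interior block sizes of s_3 (an illustrative helper of our own, not the authors' code):

```python
def max_1_separated_unit_blocks(block_sizes):
    """Greedily count a maximum number of pairwise non-neighboring size-1
    blocks: scan left to right and take a size-1 block whenever it is not
    adjacent to the block taken last."""
    count, last = 0, -2
    for i, size in enumerate(block_sizes):
        if size == 1 and i > last + 1:
            count += 1
            last = i
    return count
```

For example, applied to the interior block sizes [1, 1, 2, 2, 1, 1], this yields ρ = 2 (the first and the fifth block).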
We conducted some experiments to empirically evaluate our algorithms and to observe structural characteristics of binary means. In Section 6.1 we compare the running times of our O(k(n + m(m − µ)))-time algorithm (Theorem 8 (ii)) with the standard O(kn^2)-time dynamic programming approach [2] described in the beginning of Section 4. We implemented both algorithms in Python. Note that we did not test the O(kn^1.87)-time algorithm since it uses another blackbox algorithm (which has not been implemented so far) in order to solve MSS. However, we believe that it is anyway slower in practice. In Section 6.2, we empirically investigate structural properties of binary condensed means such as the length and the starting symbol (note that these two characteristics completely define the mean). All computations have been done on an Intel i7 QuadCore (4.0 GHz).

For our experiments we used the CASAS human activity datasets [10] as well as some randomly generated data. The data in the CASAS datasets are generated from sensors which detect (timestamped) changes in the environment (for example, a door being opened/closed) and have previously been used in the context of binary dtw computation [16]. We used the datasets HH101–HH130 and sampled from them to obtain input strings of different lengths and sparsities (for a binary string s, we define the sparsity as |s̃|/|s|). For the random data, the sparsity value was used as the probability that the next symbol in the string will be different from the last one (hence, the actual sparsities are not necessarily exactly the sparsities given but very close to those).

Source code available at .
Datasets available at casas.wsu.edu/datasets/.

Figure 2
Running times of the standard and the fast algorithm on sparse data (sensor D002 in dataset HH101) in 10-minute intervals (left) and 1-minute intervals (right).
To examine the speedup provided by our algorithm, we compare it with the standard O(kn^2)-time dynamic programming algorithm on (very) sparse real-world data (sparsity ≈ 0.1 and ≈ 0.01) and on sparse (sparsity ≈ 0.1) and dense (sparsity ≈ 0.5) random data, both for various values of k. Figure 2 shows the running times on real-world data. For sparsity ≈ 0.1, our algorithm is around 250 times faster than the standard algorithm, and for sparsity ≈ 0.01 it is around 350 times faster. Figure 3 shows the running times of the algorithms on larger random data. For sparsity ≈ 0.1, our algorithm is still twice as fast for n = 10,000 as the standard algorithm for n = 1000. These results clearly show that our algorithm is valuable in practice.

We also studied the typical shape of binary condensed means. The questions of interest are "What is the typical length of a condensed mean?" and "What is the first symbol of a condensed mean?". Since the answers to these two questions completely determine a condensed mean, we investigated whether they can be easily determined from the inputs. To answer the question regarding the mean length, we tested how much the actual mean length differs from the median condensed length. Recall that by Lemma 7 we know that every condensed mean has length at least µ − 2, where µ is the median input condensation length. We call this lower bound the median condensed length. We used our algorithm (Theorem 8 (ii)) to compute condensed means on random data with sparsities 0.05, . . . , 0.8 and k = 1, . . . ,
60. It can be observed that, for sparse data, the computed mean length differs from the median condensed length mainly for larger k. This may be possible because more input strings increase the probability that there is one input string with long condensation length and large block sizes. For dense inputs, there seems to be no real dependence on k.

Figure 3 Running times of the standard algorithm (left) and of the fast algorithm (right) on sparse random data for increasing input lengths n.

Figure 4 Difference between median condensed length and calculated mean length depending on sparsity and number of input strings. For every pair (σ, k) ∈ {0.05, . . . , 0.8} × [60], we calculated one mean for k strings with sparsity σ. No dot means that the median condensed length and the mean length did not differ by more than one. A blue (dark gray) dot means they differed slightly (difference between two and four) and a red (light gray) dot means they differed by at least five.

To answer the question regarding the first symbol of a mean, we tested on random data with different k values and different sparsities how often the first symbol of a computed mean coincides with a majority first symbol of the input strings (see Tables 1 and 2).

To sum up the above empirical observations, we conclude that a condensed binary mean

Table 1
Frequency (over 1000 runs) of the first symbol of the mean also being the first symbol in the majority of input strings.

k \ sparsity   0.05   0.1    0.2    0.5    0.8    1
 5             76%    79%    82%    82%    82%    80%
15             75%    81%    82%    83%    85%    85%
40             82%    84%    88%    87%    91%    97%

Table 2
Frequency (over 1000 runs) of the first symbol of the mean also being the majority of symbols throughout the first blocks of input strings.

k \ sparsity   0.05   0.1    0.2    0.5    0.8    1
 5             69%    73%    75%    83%    85%    80%
15             67%    73%    75%    82%    88%    85%
40             66%    70%    74%    81%    91%    97%

typically has a length close to the median condensed length and starts with the majority symbol among the starting symbols in the inputs.

In this work we made progress in understanding and efficiently computing binary means of binary strings with respect to the dtw distance. First, we proved tight lower and upper bounds on the length of a binary (condensed) mean which we then used to obtain fast polynomial-time algorithms to compute binary means by solving a certain number problem efficiently. We also obtained linear-time algorithms for special cases. A natural open question is whether the running times can be further improved; this could be achieved by finding faster algorithms for our "helper problem" Min 1-Separated Sum (MSS). Can one solve
BDTW-Mean in linear time for every constant k (that is, in f(k) · n time for some function f)? Also, finding improved algorithms for the weighted version or the center version (see Section 4) might be of interest.

References

[1] A. Abboud, A. Backurs, and V. V. Williams. Tight hardness results for LCS and other sequence similarity measures. In Proceedings of the 56th Annual IEEE Symposium on Foundations of Computer Science (FOCS '15), pages 59–78, 2015.

[2] M. Brill, T. Fluschnik, V. Froese, B. J. Jain, R. Niedermeier, and D. Schultz. Exact mean computation in dynamic time warping spaces. Data Mining and Knowledge Discovery, 33(1):252–291, 2019.

[3] K. Bringmann and M. Künnemann. Quadratic conditional lower bounds for string problems and dynamic time warping. In Proceedings of the 56th Annual IEEE Symposium on Foundations of Computer Science (FOCS '15), pages 79–97, 2015.

[4] K. Buchin, A. Driemel, and M. Struijs. On the hardness of computing an average curve. CoRR, abs/1902.08053, 2019.

[5] L. Bulteau, F. Hüffner, C. Komusiewicz, and R. Niedermeier. Multivariate algorithmics for NP-hard string problems. Bulletin of the EATCS, 114, 2014.

[6] L. Bulteau, V. Froese, and R. Niedermeier. Tight hardness results for consensus problems on circular strings and time series. CoRR, abs/1804.02854, 2018.

[7] Z. Chen and L. Wang. Fast exact algorithms for the closest string and substring problems with application to the planted (ℓ, d)-motif model. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 8(5):1400–1410, 2011.

[8] Z. Chen, B. Ma, and L. Wang. A three-string approach to the closest string problem. Journal of Computer and System Sciences, 78(1):164–178, 2012.

[9] Z. Chen, B. Ma, and L. Wang. Randomized fixed-parameter algorithms for the closest string problem. Algorithmica, 74(1):466–484, 2016.

[10] D. Cook, A. Crandall, B. Thomas, and N. Krishnan. CASAS: A smart home in a box. Computer, 46, 2013.

[11] M. Frances and A. Litman. On covering problems of codes. Theory of Computing Systems, 30(2):113–119, 1997.
[12] O. Gold and M. Sharir. Dynamic time warping and geometric edit distance: Breaking the quadratic barrier. ACM Transactions on Algorithms, 14(4):50:1–50:17, 2018.

[13] J. Gramm, R. Niedermeier, and P. Rossmanith. Fixed-parameter algorithms for closest string and related problems. Algorithmica, 37(1):25–42, 2003.

[14] W. Kuszmaul. Dynamic time warping in strongly subquadratic time: Algorithms for the low-distance regime and approximate evaluation. In Proceedings of the 46th International Colloquium on Automata, Languages, and Programming (ICALP '19), pages 80:1–80:15, 2019.

[15] M. Li, B. Ma, and L. Wang. On the closest string and substring problems. Journal of the ACM, 49(2):157–171, 2002.

[16] A. Mueen, N. Chavoshi, N. Abu-El-Rub, H. Hamooni, and A. Minnich. AWarp: Fast warping distance for sparse time series. In Proceedings of the 16th IEEE International Conference on Data Mining (ICDM '16), pages 350–359, 2016.

[17] N. Nishimura and N. Simjour. Enumerating neighbour and closest strings. In Proceedings of the 7th International Symposium on Parameterized and Exact Computation (IPEC '12), pages 252–263. Springer, 2012.

[18] H. Sakoe and S. Chiba. Dynamic programming algorithm optimization for spoken word recognition. IEEE Transactions on Acoustics, Speech, and Signal Processing, 26(1):43–49, 1978.

[19] A. Sharabiani, H. Darabi, S. Harford, E. Douzali, F. Karim, H. Johnson, and S. Chen. Asymptotic dynamic time warping calculation with utilizing value repetition. Knowledge and Information Systems, 57(2):359–388, 2018.

[20] S. Yuasa, Z. Chen, B. Ma, and L. Wang. Designing and implementing algorithms for the closest string problem.