Cost-based Query Rewriting Techniques for Optimizing Aggregates Over Correlated Windows
Wentao Wu, Philip A. Bernstein, Alex Raizman, Christina Pavlopoulou
CCost-based Query Rewriting Techniques for OptimizingAggregates Over Correlated Windows ∗ Wentao Wu [email protected] Research
Philip A. Bernstein [email protected] Research
Alex Raizman [email protected]
Christina Pavlopoulou [email protected] of California, Riverside
ABSTRACT
Window aggregates are ubiquitous in stream processing. In AzureStream Analytics (ASA), a stream processing service hosted by Mi-crosoft’s Azure cloud, we see many customer queries that containaggregate functions (such as
MIN and
MAX ) over multiple corre-lated windows (e.g., tumbling windows of length five minutes andten minutes) defined on the same event stream. In this paper, wepresent a cost-based optimization framework for optimizing suchqueries by sharing computation among multiple windows. Sinceour optimization techniques are at the level of query rewriting,they can be implemented on any stream processing system thatsupports a declarative, SQL-like query language without changingthe underlying query execution engine. We formalize the sharedcomputation problem, present the optimization techniques in detail,and report evaluation results over synthetic workloads.
Near-real-time querying of data streams is required by many appli-cations, such as algorithmic stock trading, fraud detection, processmonitoring, and RFID event processing. The importance of this tech-nology has been growing due to the surge of demand for Internet ofThings (IoT) and edge computing applications, leading to a varietyof systems from both the open-source community (e.g., ApacheStorm [40], Apache Spark Streaming [7, 44], Apache Flink [14]) andthe commercial world (e.g., Amazon Kinesis [1], Microsoft AzureStream Analytics [2], Google Cloud Dataflow [3]). Although im-perative programming/query interfaces, such as the functional ex-pressions used in Trill [19] (see Figure 1(b) for an example), remainavailable in these stream processing systems, declarative SQL-likequery interfaces are becoming increasingly popular. For example,Apache Spark recently introduced structured streaming , a declara-tive streaming query API based on Spark SQL [7]. Azure StreamAnalysics (ASA), Microsoft’s cloud-based stream processing service,also differentiates itself with a SQL interface.Declarative query interfaces allow users of stream processingsystems to focus on what task is to be completed, rather than thedetails of how to execute it. When it comes to the question of ef-ficient query execution, they rely on powerful query optimizers.In the traditional world of database management systems, the suc-cess of declarative query languages heavily depends on cost-basedquery optimization , which has been an active area for research since1970’s [37]. Unfortunately, in spite of the increasing popularity of ∗ Work was done when Christina Pavlopoulou was at Microsoft.
SELECT
DeviceID, System.Window().Id,
Min (temperature) AS MinTemp,
FROM Input TIMESTAMP BY
EntryTime
GROUP BY
DeviceID, Windows(Window(‘10 min’, TumblingWindow( minute , 10), Window(‘20 min’, TumblingWindow( minute , 20), Window(‘30 min’, TumblingWindow( minute , 30), Window(‘40 min’, TumblingWindow( minute , 40)) (a) ASA query
Source.Multicast(s => s.Tumbling(“_10”).GroupAggregateWin(w,k, Min(e.a),(w, k, agg0) =>{w, k, agg0.Min}).Union(s.Tumbling(“_20”).GroupAggregateWin(w,k, Min(e.a),(w, k, agg0) =>{w, k, agg0.Min}).Union(s.Tumbling(“_30”).GroupAggregateWin(w,k, Min(e.a),(w, k, agg0) =>{w, k, agg0.Min}).Union(s.Tumbling(“_40”).GroupAggregateWin(w,k, Min(e.a),(w, k, agg0) =>{w, k, agg0.Min}))) (b) Translated Trill [19] expression
Figure 1: An ASA aggregation query over multiple windows. declarative query interfaces in stream processing systems, cost-based query optimization of such systems remains underdeveloped— most systems, if not all, rely on rule-based query optimizers.In this paper, we focus on cost-based optimization techniques forwindow aggregates, an ubiquitous category of streaming queries,in declarative stream processing systems. In our experience withASA, users often want to perform the same aggregate functionover the same data stream but with windows of different sizes.They do this for a variety of reasons, such as learning about ordebugging a stream by exploring its behavior over different timeperiods, reporting near real-time behavior of a stream over smallwindows as well as much longer windows (e.g., an hour vs. a week),and simultaneously supporting different users whose dashboardsdisplay stream behavior over different window sizes.A straightforward implementation would evaluate the aggregatefunction over each window separately. Although this implementa-tion is relatively simple, it potentially wastes CPU cycles. We startwith an example to illustrate this inefficiency.Example 1 (Multi-window Aggregate Query).
Figure 1(a)presents a query with a single aggregate function (i.e.,
MIN ) overmultiple windows. It returns the minimum temperature reported by a r X i v : . [ c s . D B ] S e p entao Wu, Philip A. Bernstein, Alex Raizman, and Christina Pavlopoulou Input StreamMultiCastAgg10Agg20Agg30 Agg40UnionMultiCast
Revised Plan
Input StreamMultiCastAgg10 Agg20 Agg30 Agg40Union
Original Plan (a) Query plan rewriting
Source.Tumbling(“_10”).GroupAggregateWin(w,k,Min(e.a),(w,k,agg0)=>{w, k,agg0.Min}).Multicast(s => s.Union(s.Tumbling(“_20”).GroupAggregateWin(w,k,Min(e.sagg0),(w,k,agg0)=>{w, k,agg0.Min}).Multicast(s => s.Union(s.Tumbling(“_40”).GroupAggregateWin(w,k,Min(e.sagg1),(w,k,agg0)=> {w,k, agg0.Min})))).Union(s.Tumbling(“_30”).GroupAggregateWin(w,k,Min(e.sagg0),(w,k,agg0)=> {w,k,agg0.Min}). (b) Translated Trill expression of the rewritten plan
Figure 2: Rewritten query by cost-based optimization. each device every 10, 20, 30, and 40 minutes. Figure 1(b) presents itsexecution plan in ASA, which runs the aggregate over each windowseparately and then takes a union of the results. The plan is showngraphically on the left side of Figure 2(a).
This execution plan is clearly inefficient. For example, the
MIN function over the 20-minute tumbling window can be computedfrom two consecutive tuples output by the 10-minute tumblingwindow, instead of computing it directly from the input stream.Such overlapping windows present an opportunity for optimization.Our cost-based optimization technique exploits this opportunityby finding the cheapest way of computing the four window aggre-gates in terms of the overall CPU overhead. It produces the revisedquery plan shown graphically on the right side of Figure 2(a). In-stead of computing the aggregate function over the four windowsseparately, the revised plan organizes the windows into a hierarchi-cal structure. As a result, downstream windows use sub-aggregatesfrom their upstream windows as inputs. For instance, aggregatesover the 20-minute and 30-minute windows are computed from sub-aggregates that are outputs of the 10-minute window, and aggre-gates of the 40-minute window are computed from sub-aggregatesthat are outputs of the 20-minute window. The revised plan’s graphis translated into a Trill [19] expression shown in Figure 2(b).
Comparison with Window Slicing.
We are not the first to noticethe opportunity to share computation when evaluating aggregatefunctions over multiple windows. An earlier well-known approachis window slicing . See [15] for a recent survey. Unlike our techniques,window slicing does not directly exploit the overlaps between win-dows by feeding sub-aggregates from upstream windows as inputsinto downstream windows. Rather, it creates shared data sketches called window slices over the input stream, which all windows canaccess to compute their aggregates.The main advantage of our technique over window slicing isthat it works at the query rewriting level without changing theimplementation of the underlying query processing engine. It canbe used with any declarative, SQL-style query language, which arepopular interfaces to stream processing systems. Our experimentalresults show the performance of our technique is comparable tostate-of-the-art window slicing techniques [29, 30].By contrast, window slicing techniques require runtime enginechanges for storage and access management of the created win-dow slices [41]. This is a non-trivial task in a modern engine forstream query processing, due to its use of code generation [19],multi-threading, concurrency control, etc. This may be why, to thebest of our knowledge, window slicing is not leveraged by existingcommercial (distributed) stream processing engines. Nonetheless,this does not rule out the possibility of pure compiler-level imple-mentations of (both existing and new) window slicing techniques.Some window-slicing runtime optimizations (e.g., [30]) are coveredby our compiler-level approach. We leave open the question ofwhether our technique covers all such optimizations.
Summary of contributions.
To summarize, this paper makes thefollowing contributions: • In Section 2, we introduce the window coverage graph (WCG),a formal model and data structure that captures the overlap-ping relationships between windows. • In Section 3, we propose a cost-based optimization frame-work using the WCG model, to minimize the computationcost of multi-window aggregate queries, as well as relatedquery rewritings on the optimal, min-cost WCG. • We extend the cost-based optimization framework in Sec-tion 4 by considering factor windows , which are auxiliarywindows that are not present in the query but can furtherreduce the overall computation cost. • We evaluate efficiency improvements of our proposed opti-mizations using synthetic workloads, presented in Section 4.Section 6 discusses related work, and Section 7 is the conclusion.
We start with a formal study of the overlapping relationships be-tween windows. We then propose window coverage graph , a formalmodel and data structure that captures overlapping relationshipsfor a given set of windows.
We follow the convention in the literature to represent a window W using two parameters [27]: • r – the range of W that represents its duration; • s – the slide of W that represents the gap between its twoconsecutive firings.Throughout this paper, we assume that s and r are integers anduse the same time unit (e.g., second, minute, hour). We assume0 < s ≤ r and write W ⟨ r , s ⟩ . ASA calls W a hopping window if s < r ,or a tumbling window if s = r . ost-based Query Rewriting Techniques for Optimizing Aggregates Over Correlated Windows A window set W = { W , ..., W n } represents a set of windowswith no duplicates. An aggregate function f defined over a windowset W computes a result for each W ∈ W and takes a union of theresults, i.e., f (W) = ∪ W ∈W f ( W ) . As an alternativeto the “range-slide” based representation, we can use a sequence of intervals to represent the lifetime of a window [10]. Without lossof generality, we assume the intervals are left-closed and right-open and define the interval representation of a window W ⟨ r , s ⟩ as W = {[ m · s , m · s + r )} , where m ≥ W ( , ) is {[ , ) , [ , ) , [ , ) , ... } . Now consider two windows W ⟨ r , s ⟩ and W ⟨ r , s ⟩ . Using theirinterval representations, we also have W = {[ m · s , m · s + r )} and W = {[ m · s , m · s + r )} where m ≥ m ≥ We say that W is coveredby W , denoted W ≤ W , if r > r and for any interval I = [ a , b ) in W there exist intervals I a = [ a , x ) and I b = [ y , b ) in W such that a > y and x < b . As a special case, a window is covered by itself.
Example 2 (Window Coverage).
Consider • W ⟨ s = , r = ⟩ ; • W ⟨ s = , r = ⟩ .Figure 3 plots the first two intervals of W ( {[ , ) , [ , )} ) and thefirst three intervals of W ( {[ , ) , [ , ) , [ , )} ). The first intervalof W is covered by the first and second intervals of W , and the secondinterval of W is covered by the second and third intervals of W . The following theorem provides sufficient and necessary condi-tions for the window coverage relation: Theorem 1. W is covered by W if and only if (1) s is a multipleof s and (2) δ r = r − r is a multiple of s . Example 3 (Window Coverage Theorem).
Consider again thewindows of Example 2: W ⟨ s = , r = ⟩ and W ⟨ s = , r = ⟩ .We have s / s = , so s is a multiple of s , and ( r − r )/ s = , so r − r is a multiple of s . By Theorem 1, W is covered by W . The window coverage relation defines a par-tial order over windows, as characterized by the following theorem:Theorem 2.
The window coverage relation is reflexive , antisym-metric , and transitive . Note that, if W is covered by W , then these two intervals are unique. Proofs are postponed to the appendix of this paper. W W time Figure 3: An example of window coverage.
Suppose that W ≤ W . For any interval I = [ a , b ) in W , let I a = [ a , x ) and I b = [ y , b ) be the two intervalsin W specified by Definition 1.Definition 2 (Covering Interval Set). Let the set of intervals“between” I a and I b in W be I a , b = {[ u , v ) : a ≤ u and v ≤ b } . Wecall I a , b the covering (interval) set of I . Clearly, I a , I b ∈ I a , b . The cardinality |I a , b | is independent ofthe choice of a and b . We call it the covering multiplier of W withrespect to W , denoted M ( W , W ) . An analytic form for the coveringmultiplier is given by the following theorem:Theorem 3. If the window W ⟨ r , s ⟩ is covered by the window W ⟨ r , s ⟩ , then M ( W , W ) = + ( r − r )/ s . We now introduce the more general notion of “interval coverage”based on the above discussion.Definition 3 (Interval Coverage).
We say that an interval I is covered by a set of intervals I if I = ∪ J ∈I J . Example 4 (Interval Coverage).
In Figure 3, for the first intervalin W the covering set consists of the first and second intervals in W ,and for the second interval in W consists of the second and thirdintervals in W . A special case of interval cov-erage is when the intervals in the covering set are disjoint.Definition 4 (Interval Partitioning).
If an interval I is cov-ered by a set of intervals I such that the intervals in I are mutuallyexclusive, then I is partitioned by I . We can further define “window partitioning” accordingly, whichis a special case of window coverage:Definition 5 (Window Partitioning).
We say that W is par-titioned by W , if W is covered by W and each interval in W is partitioned by its covering set in W . Figure 4 illustrates the difference between window partitioningand general window coverage. Here each interval of W is cov-ered by two intervals of W , i.e., M ( W , W ) =
2. We now providerigorous conditions for window partitioning:Theorem 4. W is partitioned by W if and only if (1) s is amultiple of s , (2) r is a multiple of s , and (3) r = s (i.e., W is atumbling window). Example 5 (Window Partitioning).
In Example 2 s / s = and r / s = . So conditions (1) and (2) in Theorem 4 hold. However,condition (3) is violated since r (cid:44) s (i.e., W is not a tumblingwindow). As a result, W cannot be partitioned by W . entao Wu, Philip A. Bernstein, Alex Raizman, and Christina Pavlopoulou IJ’ J’’W W (a) Window partitioning IJ’ J’’W W (b) Window coverage Figure 4: A comparison of window partitioning with generalwindow coverage.
We define the windows coverage graph G = (W , E) for a givenwindow set W based on the partial order introduced by the windowcoverage relation. For every W , W ∈ W such that W ≤ W , weadd an edge e = ( W , W ) to the edge set E . The time complexity ofconstructing the WCG is O (|W| ) , given that checking the windowcoverage relationship takes only constant time (Theorems 1 and 4). We now study the problem of evaluating aggregate functions overa window set that is modeled by its WCG. We first revisit a classictaxonomy of aggregate functions in the new context of windowset and WCG. We then present a cost-based framework for theWCG, with the goal of minimizing the overall computation cost.We further present query rewriting techniques with respect to anoptimal WCG.
Let f be a given aggregate function, e.g., MIN , MAX , AVG , and so on.Gray et al. classified f into three categories [22]: • Distributive – f is distributive if there is some function д such that, for a table T , f ( T ) = д ({ f ( T ) , ..., f ( T n )}) where T = { T , ..., T n } is a disjoint partition of T . Typical examplesinclude MIN , MAX , COUNT , and
SUM . In fact, f = д for MIN , MAX , and
SUM but for
COUNT д should be SUM . • Algebraic – f is algebraic if there are functions д and h suchthat f ( T ) = h ({ д ( T ) , д ( T ) , ..., д ( T n )}) . Typical examples are AVG and
STDEV . For
AVG , д records the sum and count foreach subset T i (1 ≤ i ≤ n ) and h computes the average for T i by dividing the sum by the count. • Holistic – f is holistic if there is no constant bound on thesize of storage needed to describe a sub-aggregate. Typicalexamples include MEDIAN and
RANK .Only distributive or algebraic aggregate functions can be com-puted by aggregating sub-aggregates [11, 41]. One important pre-requisite in this taxonomy is that T = { T , ..., T n } is a partition of T . In our scenario, it means that if we want to evaluate f overa window W by aggregating sub-aggregates that have been com-puted over another window W , then W has to be partitioned by W .Theorem 5. Given that window W is partitioned by window W ,if the aggregate function f is either distributive or algebraic, then f over W can be computed by aggregating sub-aggregates over W . Although recent work [11, 41] on window slicing “supports” holistic functions, thecorresponding window slices contain all input events rather than sub-aggregates.
Rs s …... r ( n - 1) s Figure 5: Illustration of the recurrence count. If W is only covered (but not partitioned) by W , then the type ofaggregate function f that can be computed using Theorem 5 mustbe further restricted, such that f remains distributive or algebraiceven if the T i ’s in T can overlap. The aggregate functions MIN and
MAX retain such properties, as stated by the following theorem:Theorem 6.
The aggregate functions
MIN and
MAX are distributiveeven if T is not disjoint. Given a streaming query q that contains an aggregate function f over a window set W , our goal is to minimize the total computationoverhead of evaluating q . A naive approach to evaluate q is tocompute f over each window of W one by one. Clearly, this will doredundant computation if the windows in W “overlap.” To minimizecomputation one needs to maximize the amount of computationthat is shared among overlapping windows. We present a cost-basedoptimization framework that does this by exploiting the windowcoverage relationships captured by the WCG of W . We use the following cost model to capturethe computation overhead in evaluating windowed aggregates.Let W = { W , ..., W n } be a window set. Given the WCG G = (W , E) , we assign a weight c i to each vertex (i.e., window) W i in W that represents its computation cost with respect to the (given)aggregate function f . The total computation cost is simply the sum of these weights, i.e., C = (cid:213) ni = c i . Our goal is to minimize C .We assume that the cost of computing f is proportional to thenumber of events processed. We further assume a steady input eventrate η ≥
1. Let R = lcm ( r , ..., r n ) be the least common multiple ofthe ranges of the windows W ⟨ r , s ⟩ , ..., W n ⟨ r n , s n ⟩ in W . For eachwindow W i , the cost c i of computing f over W i for events in a periodof length R depends on two quantities: • Recurrence count n i – the number of intervals (i.e., instances)of W i occurring during the period of R ; • Instance cost µ i – the cost of evaluating each instance of W i .Clearly, c i = n i · µ i . We next analyze the two quantities. Recurrence count.
For each window W i , let m i = R / r i be its multiplicity . Then the recurrence count n i of W i can be written as: n i = + ( m i − ) r i s i . (1) ost-based Query Rewriting Techniques for Optimizing Aggregates Over Correlated Windows Figure 5 illustrates how we obtained the above formula for n i .Essentially, we have R = ( n i − ) · s i + r i , which yields n i = + R − r i s i = + (cid:16) Rr i − (cid:17) r i s i = + ( m i − ) r i s i . If W i is a tumbling window, then n i = m i . In this paper we assumethat r i is a multiple of s i so that n i is an integer. Instance cost.
Clearly, without any computation sharing, theinstance cost of W i is µ i = η · r i . Sharing computation, howeverenables reducing the computation cost. Consider W ⟨ r , s ⟩ and W ⟨ r , s ⟩ . We have the following observation:Observation 1. If W is covered by (perhaps multiple) W ’s, thenthe instance cost of W can be reduced to µ = min W s.t. W ≤ W { n · M ( W , W )} . Algorithm 1 presents our procedure forfinding the minimum overall cost based on the WCG, cost model,and Observation 1. It starts by constructing the WCG G with respectto the given window set W and aggregate function f (line 1) – weneed f to know whether to use “covered by” or “partitioned by”when constructing WCG. We then process the windows one byone (lines 2 to 5).For each window W i , at line 3 we initialize its cost with c i = n i · ( η · r i ) . We then iterate over incoming edges ( W ′ , W i ) , revisingthe cost c i with respect to Observation 1 (lines 4 to 5). Finally, weremove all edges that do not correspond to the one that led to theminimum cost (lines 6 to 7). The result is graph G min , called the min-cost WCG hereafter in this paper, which captures all minimumcost information. It is the input to the query rewriting algorithmwe will discuss in Section 3.3. Algorithm 1:
Find the min-cost WCG.
Input: W = { W i } ni = , a window set; f , an aggregate function. Output: G min , the min-cost WCG w.r.t. W and f . Construct the WCG G = (W , E) w.r.t. “covered by” or“partitioned by” as determined by f ; foreach W i ∈ W do Initialize its cost c i ← n i · ( η · r i ) ; foreach W ′ ∈ W s.t. ( W ′ , W i ) ∈ E do Revise cost c i ← min { c i , n i · M ( W i , W ′ )} ; foreach W i ∈ W do Remove all incoming edges that do not correspond to (thefinal value of) c i ; return the result graph G min ;Example 6. Consider the ASA query in Figure 1(a), which con-tains four tumbling windows: W ⟨ , ⟩ , W ⟨ , ⟩ , W ⟨ , ⟩ , and If we want n i to be an integer when r i is not a multiple of s i , m i − must be a multipleof s i . Thus, m i − = l i · s i where l i is an integer, which yields R = r i ( + l i · s i ) ,for all ≤ i ≤ n . Therefore, all n i ’s are integers only if there exist integers l , ..., l n such that r ( + l · s ) = · · · = r n ( + l n · s n ) . We leave the case when n i ’s maynot be integers for future work. In our current implementation, we use “covered by” semantics when f is MIN or MAX , and “partitioned by” when f is COUNT , SUM , and
AVG , which are part of the SQLstandard. Future work could expand these two lists with other aggregate functions. The initial cost is c i = m i · ( η · r i ) = η · R if W i is a tumbling window. W ⟨ , ⟩ . It does not matter which aggregate function f we choosehere, since “covered by” and “partitioned by” semantics coincide whenall windows in a window set are tumbling windows.Assuming an incoming event ingestion rate η = , the total costof computing the four windows is C = ηR = R = , where R = lcm { , , , } = .Figure 6 shows the initial WCG (Figure 6(a)) and the final min-cost WCG (Figure 6(b)) by running Algorithm 1, when exploiting theoverlaps between the windows. We next walk through how Algorithm 1produces the min-cost WCG. For the four tumbling windows, we have n = m = R / r = / = , n = m = R / r = / = , n = m = R / r = / = , n = m = R / r = / = . Following the window coverage relationship captured by the initialWCG, we can compute the corresponding covering multipliers for theedges as follows: M ( W , W ) = + ( r − r )/ s = + ( − )/ = , M ( W , W ) = + ( r − r )/ s = + ( − )/ = , M ( W , W ) = + ( r − r )/ s = + ( − )/ = , M ( W , W ) = + ( r − r )/ s = + ( − )/ = . As a result, Algorithm 1 will pick W as the (unique) upstream windowfor W , and pick W as the (unique) upstream window for W and W in the final min-cost WCG. The total cost is therefore reduced to C ′ = c ′ + c ′ + c ′ + c ′ = R + n · M ( W , W ) + n · M ( W , W ) + n · M ( W , W ) = + · + · + · = , a 62.5% reduction from the initial cost C = . W (10, 10)W (20, 20) W (30, 30)W (40, 40) (a) Initial WCG c = R = 120c = n *M(W , W ) = 12c = n *M(W , W ) = 6c = n *M(W , W ) = 12 W (10, 10)W (20, 20) W (30, 30)W (40, 40) (b) Min-cost WCG Figure 6: WCG and min-cost WCG for Example 1.
To leverage the benefits of shared window computation, we rewritethe original ASA query plan with respect to the min-cost WCG G min based on the following observation:Theorem 7. The min-cost WCG G min is a forest, i.e., a collectionof trees. entao Wu, Philip A. Bernstein, Alex Raizman, and Christina Pavlopoulou The proof follows directly from noticing that each window in G min has at most one incoming edge (due to lines 6 to 7).Figure 2 shows how we revise the query execution plan in Ex-ample 1. Figure 2(a) presents the original plan and the revised planbased on the min-cost WCG. Figure 2(b) presents the translatedTrill expression [19].Formally, given G min that captures the optimal window cover-age relationships, the query rewriting algorithm works as follows.Suppose that the original plan isInput Stream ⇒ MultiCast ⇒ W = { W , ..., W n } ⇒ Union . We first replace W by the min-cost WCG G min :Input Stream ⇒ MultiCast ⇒ G min ⇒ Union . We then perform the following steps: • For each window w (in G min ) without an incoming edge,create a link from MultiCast to w . Remove the MultiCast operator if there is only one such w . • For each (intermediate) window v with outgoing edges, in-sert a MultiCast operator M v . Create a link from v to M v and a link from M v to Union . For each ( v , u ) of v ’s outgoingedges, create a link from M v to u . • For each window w without outgoing edges, create a linkfrom w to Union . We have been confining our discussion to sharing computation overwindows in the given window set. One can add auxiliary windowsthat are not in the window set but may nevertheless help reducethe overall computation cost. We call them factor windows .Definition 6.
Given a window set W , a window W is called a factor window with respect to W if W (cid:60) W and there exists somewindow W ′ ∈ W such that W ′ ≤ W . Note that we do not expose the results of factor windows tousers, as they are not part of the user query.Example 7.
Suppose we modify the query in Figure 1(a) by re-moving the tumbling window W ( , ) . The resulting query Q con-tains three tumbling windows W ( , ) , W ( , ) , and W ( , ) .The cost of directly computing them is C = R = , as here R = lcm { , , } = remains the same.If we apply Algorithm 1 over Q , we get the min-cost WCG presentedin Figure 7(a). As a result, the overall cost is C ′ = c ′ + c ′ + c ′ = R + R + n · M ( W , W ) = + + · = , a reduction of 31.7% from the baseline cost C = .If we allow factor windows and apply Algorithm 2 over Q , then weget the min-cost WCG in Figure 7(b). Window W ( , ) is “addedback” as a factor window, which participates in evaluating Q but doesnot expose its result to users. As in Example 6(a), the overall cost nowis C ′′ = , which is 58.3% less than the baseline cost C = and39% less than the cost C ′ = without using factor windows. S(1, 1)W (20, 20) W (30, 30)W (40, 40) c = R =120c = R = 120c = n *M(W , W ) = 6 (a) Initial WCG S(1, 1)W (20, 20) W (30, 30)W (40, 40) W (10, 10) c = 120c = 12 c = 12c = 6 (b) Min-cost WCG Figure 7: Min-cost WCGs for Example 1 with and withoutusing factor windows. W W …... W K W (a) Interesting W W …... W K W (b) Uninteresting Figure 8: Two basic patterns in WCG ( K ≥ ). One natural question to ask is: When does a factor window help?In the following, we provide a formal analysis.
Augmented WCG.
For the WCG G = (W , E) induced by thegiven window set W and aggregate function f , we add a virtualtumbling window S ⟨ r = , s = ⟩ into W , and add an edge ( S , W ) into E for each W ∈ W that has no incoming edges (i.e., W is notcovered by any other window). However, if such an S already existsin W , we do not add another one. Intuitively, S represents a windowconsisting of atomic intervals that emit an aggregate for each timeunit; therefore S covers all windows in W . The computation costof S is always η · R , as it cannot be covered by any other window.This augmented graph is a directed acyclic graph (DAG) with asingle “root” S . From now on, when we refer to the WCG we meanits augmented version. Two Basic Patterns.
Figure 8 presents two basic patterns in (theaugmented) WCG, for an arbitrary window W ∈ W . We are inter-ested in the pattern in Figure 8(a) but not the one in Figure 8(b), as W can only affect the costs of its downstream windows. This eliminateswindows in WCG without outgoing edges from consideration. Analysis of Impact.
As shown in Figure 9, let W f be a factorwindow inserted “between” W and its downstream windows W ,..., W n . We can do this for all “intermediate” vertices, i.e., windowswith both incoming and outgoing edges, in (the augmented) WCG,thanks to the added virtual “root” S . Clearly, we have W f ≤ W ,as well as W j ≤ W f for 1 ≤ j ≤ K . We now compare the overallcomputation costs with and without inserting W f . The cost withthe factor window W f is c = (cid:213) Kj = cost ( W j ) + cost ( W f ) + cost ( W ) . On the other hand, the cost without W f is c ′ = (cid:213) Kj = cost ′ ( W j ) + cost ( W ) . Note that the augmented WCG honors the same “covered by” or “partitioned by”semantics determined by the aggregate function f , when adding factor windows..6 ost-based Query Rewriting Techniques for Optimizing Aggregates Over Correlated Windows W W …... W K W W W …... W K WW f Figure 9: Impact of factor window W f . Sincecost ( W j ) = n j · M ( W j , W f ) , cost ( W f ) = n f · M ( W f , W ) , cost ′ ( W j ) = n j · M ( W j , W ) , it then follows that c − c ′ = (cid:213) Kj = n j (cid:16) M ( W j , W f ) − M ( W j , W ) (cid:17) + n f M ( W f , W ) . By Theorem 3, M ( W j , W f ) = + ( r j − r f )/ s f , M ( W j , W ) = + ( r j − r W )/ s W , and M ( W f , W ) = + ( r f − r W )/ s W . Substituting into the above equation, we obtain c − c ′ = (cid:213) Kj = n j (cid:16) r j − r f s f − r j − r W s W (cid:17) + n f (cid:16) + r f − r W s W (cid:17) . We now define the following quantities to simplify notation: ρ j = r j / r f , k j = r j / s j , ≤ j ≤ K , k f = r f / s f , and k W = r W / s W . With this simplified notation, we have c − c ′ = n f (cid:16) (cid:213) Kj = n j n f (cid:16) r j s f − k f − r j s W + k W (cid:17) + ( + r f s W − k W ) (cid:17) . (2)Inserting W f improves if and only if c ≤ c ′ , i.e., (cid:213) Kj = n j n f (cid:16) r j s f − k f − r j s W + k W (cid:17) + ( + r f s W − k W ) ≤ . (3) We can use Equation 3 to determine whether a factor window isbeneficial, the next problem is to find all candidate factor windowsthat are beneficial, from which we can select the best one. We nowdiscuss this candidate generation and selection procedure in detail.
In general, there is no good way togenerate all candidate factor windows other than enumerating allwindows W f ⟨ r f , s f ⟩ that conform to the pattern in Figure 9 withrespect to conditions required by the window coverage relation-ship (i.e., Theorem 1). Specifically, the candidate generation phaseconsists of two steps: • Generation of eligible slides : Let s d = gcd { s , ..., s K } . The set ofcandidates s f is S f = { s f : s d mod s f = s f mod s W = } . • Generation of eligible ranges : Let r min = min { r , ..., r K } . For each s f ∈ S f , the set of candidates r f is R f = { r f : r f mod s f = r f ≤ r min } . Algorithm 2:
Find the min-cost WCG when factor windowsare allowed.
Input: W = { W i } ni = , a window set; f , an aggregate function. Output: G min , the min-cost WCG w.r.t. W and f , wherefactor windows are allowed. Main : Construct the WCG G = (W , E) w.r.t. “covered by” or“partitioned by” determined by f ; foreach W ∈ W do W f ← FindBestFactorWindow ( W , W ’s downstreamwindows { W , ..., W K } ); Expand G by adding W f and the corresponding edges (asin Figure 9); G min ← Run lines 2-7 of Algorithm 1 over the expanded G ; return the result graph G min ; FindBestFactorWindow ( W , { W , ..., W K } ): Construct the set W f of candidate factor windows w.r.t.Figure 9; Remove candidates from W f that are not beneficial, usingEquation 3; return the best W f ∈ W f w.r.t. the maximum estimated costreduction by Equation 2; For each eligible pair ( s f , r f ) , we construct a candidate factor win-dow W f ⟨ r f , s f ⟩ and check the window coverage constraints inFigure 9, i.e., W f ≤ W and W k ≤ W f for 1 ≤ k ≤ K . Only if W f is beneficial (by Equation 3) should we include it in the candidatefactor windows W f . Many candidate factor windows in W f may be beneficial (i.e., Equation 3 holds). Only the one that leadsto the minimum overall cost should be added. To select it from thecandidates, we simply compare their estimated cost reduction byEquation 2 and pick one that leads to the maximum cost reduction. Algorithm 2 is the revised version of Algorithm 1 that returns themin-cost WCG when factor windows are allowed. It first extendsthe original WCG by adding the best factor windows found forexisting windows (lines 3 to 5), using techniques in Sections 4.1and 4.2 (in lines 9 to 12). It then simply invokes Algorithm 1 onthe extended WCG (rather than the original one) to find the newmin-cost WCG that contains factor windows (line 6).Unfortunately, Algorithm 2 is no longer optimal, unlike Algo-rithm 1. In fact, the cost minimization problem when factor win-dows are allowed is an instance of the Steiner tree problem [28],which is NP-hard. Various approximate algorithms have been pro-posed for Steiner trees (e.g., [13, 36]), but we choose to stay withAlgorithm 2 because it is simple and easy to implement. In our ex-perimental evaluation, Algorithm 2 often outperforms Algorithm 1,returning a WCG with orders-of-magnitude lower cost. Neverthe-less, since we cannot guarantee the optimality of the WCG returnedby Algorithm 1, in practice we compare the costs of the WCGs re-turned by Algorithms 1 and 2, and return the better one. entao Wu, Philip A. Bernstein, Alex Raizman, and Christina Pavlopoulou We can improve the procedure
FindBestFactorWindow in Algo-rithm 2 if we restrict the window coverage relationships to “par-titioned by” semantics, which works for more types of aggregatefunctions. In this special case, the candidate factor windows arerestricted to tumbling windows (by Theorem 4).
Algorithm 3:
Determine whether a factor window would helpunder “partitioned by” semantics.
Input:
A factor window W f , and a target window W with itsdownstream windows W , ..., W K . Output:
Return true if adding W f improves the overall cost, false otherwise. if K ≥ then return true ; // We have K = if k = then return false ; else // We have k > if k ≥ and m ≥ then return true ; else Compute r f r W and λλ − = + m ( m − )( k − ) ; return true if r f r W ≥ λλ − , false otherwise ; We first revisit the prob-lem of determining whether a factor window is beneficial, under“partitioned by” semantics. Algorithm 3 summarizes the procedurethat determines whether a factor window W f would help in thecase of “partitioned by.” Here, the quantity λ is defined as λ = (cid:213) Kj = n j m j , (4)The procedure in Algorithm 3 looks complicated. We offer someintuition below to help understand it: (Case 1) If W f only has one downstream window W that is tum-bling (i.e., the case when K = k = W to compute W f itself. Without W f one can use the samesub-aggregates to compute W directly. (Case 2) If W f has two or more downstream windows (i.e., when K ≥ W f (rather than from W ). We provide more explanation usinga special case (referring to Figure 9) when K = k f = k W = r f = s f , and r W = s W , since both W f and W are tumbling windows: c − c ′ = (cid:213) j = n j · (cid:16) r j r f − r j r W (cid:17) + n f · r f r W . Moreover, since all windows are tumbling, n j = m j = R / r j for j ∈ { , } , and n f = m f = R / r f . As a result, c − c ′ = R · (cid:16) r f − r W (cid:17) ≤ , since r f ≥ r W by Theorem 4. The case when W f has one uniquedownstream window W that is not tumbling (i.e., when K = k >
1) can be understood in a similar way, as sub-aggregates from W f can reduce cost for intervals in W that overlap.In the appendix of this paper, we provide a formal analysis ofthe correctness of Algorithm 3 (using Equations 2 and 4):Theorem 8. Algorithm 3 correctly determines whether W f wouldhelp when both W f and W are tumbling windows. We now re-visit the problems of candidate generation and selection under“partitioned by” semantics. (Candidate Generation)
By restricting to tumbling windows un-der “partitioned by” semantics, we can significantly reduce thesearch space for potential candidates. By Theorem 4, the range r f of a factor window W f must be a common factor of the ranges r ,..., r K of all downstream windows W , ..., W K for a given targetwindow W (ref. the pattern in Figure 9). Moreover, r f must also bea multiple of the range r W of the target window W . As a result, onecan enumerate all candidates by starting from the greatest commondivisor r of r , ..., r K and look for all factors r f of r that are alsomultiples of r W . (Candidate Selection) To find the best factor window, we comparethe benefits of two candidates W f and W ′ f . There are two cases asshown in Figure 10: • W f and W ′ f are dependent , meaning either W f ≤ W ′ f or W ′ f ≤ W f – Figure 10(a); • W f and W ′ f are independent – Figure 10(b). Dependent Candidates.
Let W f and W ′ f be two eligible factorwindows such that W ′ f ≤ W f . Then W f can be omitted as adding itcannot reduce the overall cost. This can be understood by runningAlgorithm 3 against W f , by viewing W ′ f as W f ’s only (tumbling)downstream window. Algorithm 3 would return false as this isthe case when K = k = W W …... W K WW’ f W f (a) Dependent W W …... W K W W’ f W f (b) Independent Figure 10: Dependent and independent factor windowswhen multiple candidates exist. ost-based Query Rewriting Techniques for Optimizing Aggregates Over Correlated Windows Independent Candidates.
For the independent case, we have tocompare the costs in more detail. Specifically, let c f = (cid:213) Kj = cost ( W j ) + cost ( W f ) + cost ( W ) = (cid:213) Kj = n j · M ( W j , W f ) + n f · M ( W f , W ) + cost ( W ) , and c ′ f = (cid:213) Kj = cost ( W j ) + cost ( W ′ f ) + cost ( W ) = (cid:213) Kj = n j · M ( W j , W ′ f ) + n ′ f · M ( W ′ f , W ) + cost ( W ) . Theorem 9.
Let W f and W ′ f be two eligible factor windows thatare independent under “partitioned by” semantics. Then c f ≤ c ′ f iff r f r ′ f ≥ λ − r f r W λ − r ′ f r W . (5) Here λ has been defined in Equation 4. Since λ is a constant that does not depend on W f or W ′ f , Equa-tion 5 implies that the comparison of costs boils down to evaluatingthree quantities: r f r ′ f , r f r W , and r ′ f r W . Algorithm 4:
Pick the best factor window under “partitionedby” semantics.
Input:
A target window W with its downstream windows W ,..., W K . Output:
Return the best factor window W f that led to theminimum overall cost. Find the greatest common divisor (GCD) of the ranges of thedownstream windows d = gcd ({ r , ..., r K }) ; if d = r W then return W ; Find all factors F of d that are multiples of r W ; W f ← ∅ ; foreach r f ∈ F do Construct a tumbling window W f with range r f ; Run Algorithm 3 for W f , W , and W , ..., W K ; if Algorithm 3 returns true then W f ⇐ W f ∪ { W f }; foreach W f ∈ W f do if there exists W ′ f s.t. W ′ f ≤ W f then W f ⇐ W f − { W f } ; return the best W f ∈ W f by Theorem 9 ;Algorithm 4 presents the details of picking the best factor win-dow for a target window W and its downstream windows W , ..., W K , under “partitioned by” semantics. It starts by enumerating allcandidates for W f based on the constraint that r f must be a commonfactor of { r , ..., r K } and a multiple of r W (lines 1 to 4). It simplyreturns W if no candidate can be found (line 3). It then filters outcandidates of W f that are not beneficial, using Algorithm 3 (lines 5to 10). It further prunes dependent candidates that are dominated by others (lines 11 to 13). Finally, it finds the best W f from theremaining candidates, with respect to Theorem 9.Example 8. Continuing with Example 7, Algorithm 4 would gen-erate three candidate factor windows W ( , ) , W ( , ) , and W ( , ) ,since all of them are beneficial according to Algorithm 3 ( K = indeed). However, since both W ( , ) and W ( , ) cover W ( , ) ,these two candidates are removed and W ( , ) is the remaining,best candidate. We report experimental evaluation results in this section. In additionto evaluating our own optimization techniques, we further comparethem with two representative window slicing techniques, panedwindow [30] and paired windows [29].
We present a brief overview of paned window and paired win-dows. Both are special cases of sliced windows . A sliced window Z (with respect to a window W ⟨ r , s ⟩ ) that has m slices is de-noted by Z ( z , ..., z m ) . We say that Z has | Z | slices and a period z = s = (cid:205) mi = z i , and that each slice z i has an edge e i = (cid:205) ij = z j (ref.Definition 2 of [29]). With this notation, paned window and pairedwindows of a window W ⟨ r , s ⟩ can be defined as follows: • Paned — X ( д , ..., д m ) , where д = · · · = д m = д is the greatestcommon divisor (GCD) of r and s , and m = s / д ; • Paired — Y ( z , z ) , where z = r mod s and z = s − s .Given multiple windows, one has two options when exploitingwindow slicing: • Unshared : We process each window separately using paned/-paired windows. For each window, input events are replicatedto generate partial aggregates on top of its own slices; these par-tial aggregates are then combined to generate the output of thewindow via a final aggregate . • Shared : We compose sliced windows of each window into asingle common sliced window. Input events are then sent tothis shared sliced window to generate partial aggregates . Eachwindow generates its output via a final aggregate by combiningpartial aggregates from relevant slices.Krishnamurthy et al. further proposed a technique that combinesthe slices from multiple windows [29]. Given a set of n windows W = { W i ⟨ r i , s i ⟩} ni = , Table 1 summarizes the costs of computing W during a period S = lcm { s , ..., s n } , using the above windowslicing techniques. Here, T is the number of events processed in theperiod S . For a steady input event rate η , T = η · S . E is the numberof slices, and thus the number of partial aggregates, in the commonsliced window with period S . See [29] for the derivation of the costformulas in Table 1. Since their composition technique is optimal(see Theorem 1 of [29]), no other window slicing technique canlead to lower cost when enabling sharing of the slices. entao Wu, Philip A. Bernstein, Alex Raizman, and Christina Pavlopoulou Technique Partial Final
Unshared paned nT (cid:205) ni = ( S / s i ) · ( r / д ) Unshared paired nT (cid:205) ni = ( S / s i ) · ⌈ r i / s i ⌉ Shared paned T (cid:205) ni = E paned · ( r i / s i ) Shared paired T (cid:205) ni = E paired · ( r i / s i ) Table 1: Costs of window slicing techniques.Algorithm 5:
Random window generator.
Input: s min , the minimum; s max , the maximum slide; k max , themaximum ratio of range over slide. Output: W ⟨ r , s ⟩ , the window generated. s ← Random ( s min , s max ) ; r ← Random ({ s , s , ..., k max s }) ; return the result window W ⟨ r , s ⟩ ; Algorithm 6:
Random DAG generator.
Input: L , the number of levels in the DAG; B , the number ofwindows at the base level; ∆ , the number of windowsto be increased per level; p , the probability of addingan edge between two windows across two consecutivelevels. Output: G , the DAG generated. // Generate the base level L ; L ← ∅ ; while B > do W ← RandomWindow ( , s max , k max ) by Algorithm 5; if W is not covered by windows in L then L ← L ∪ { W } ; B ← B − // Generate the remaining levels; for ≤ l ≤ L do L l ← ∅ ; C ← B + ∆ · l ; while C > do S ←
RandomSubset (L l − , p ) ; s min ← lcm { s : W ⟨ r , s ⟩ ∈ S} ; W ← RandomWindow ( s min , s max , k max ) byAlgorithm 5; if W is not covered by windows in L l then L l ← L l ∪ { W } ; C ← C − return the result graph G ; Data Generation.
We generated synthetic data with various in-put event rate η ∈ [ , η max ] where η max is a parameter of the data To double-check, we can directly compare the benefits of the three candidates. Onecan compute that (1) W ( , ) leads to the same cost when considering the patternin Figure 9 including W and W ; (2) W ( , ) leads to the cost , a reduction of 30%;and (3) W ( , ) leads to the cost , a reduction of 37.5%. (a) |W | = , η = (b) |W | = , η = (c) |W | = , η = Figure 11: Costs of general window sets generated by Ran-domGen. y -axis is at logarithmic scale. generator that represents the maximum event rate. We further gen-erated window sets using the following approaches with varyingdegrees of window correlations: • RandomGen : We generate each window W ⟨ r , s ⟩ following Al-gorithm 5, by picking s randomly from [ , s max ] and r randomlyfrom { s , s , ..., k max s } , where s max and k max are integers. • ChainGen : We generate random windows W , ..., W n sequen-tially following Algorithm 5 with the constraint that W i + iscovered by W i for all 1 ≤ i ≤ n − • StarGen : We generate random windows W , ..., W n sequentiallyfollowing Algorithm 5 with the constraint that each W i with2 ≤ i ≤ n is covered by the window W . • RandomGraphGen : We generate random graphs that capturethe coverage relationships between windows. The graphs areDAGs that group windows into levels, each consisting of windowsthat are not covered by each other. We use Algorithm 6 to con-struct the DAGs bottom-up. Its function RandomSubset (L l − , p ) ost-based Query Rewriting Techniques for Optimizing Aggregates Over Correlated Windows (a) |W | = , η = (b) |W | = , η = (c) |W | = , η = Figure 12: Costs of tumbling window sets by RandomGen. generates a subset of L l − by randomly choosing windows from L l − with probability p . Algorithm 6 linearly increases the num-ber of windows as levels go up, but other growth policies arepossible with different growth rates (e.g., constant growth , super-linear growth , and sub-linear growth ).We generate 10 random window sets with 5 and 10 windows usingeach of RandomGen , ChainGen , and
StarGen , respectively. Wefurther generate 10 random window sets using
RandomGraph-Gen by having the base level contain 2 windows, with 3 levelsin total, each upper level increasing the number of windows by2. That is, the generated DAGs contain 3 levels, with 2, 4, and 6windows in each level from bottom to top. While we could havegenerated larger window sets, we intentionally did not do that aswe do not expect people to write queries containing more than 10windows. Moreover, to further experiment with the special case of “partitioned by” semantics, we generated variants of the abovewindow sets with only tumbling windows.
Participating Techniques.
We compare our WCG-based tech-niques, with and without factor windows, with window slicingtechniques presented in Section 5.1. Since paired windows cannever be worse than paned window [29], we only compare withpaired windows. Moreover, as a baseline, we compare with directevaluation of window set without using WCG. To summarize, wehave five participating techniques in our comparative study: • Unshared Paired (UP) , which uses paired windows foreach window without sharing; • Shared Paired (SP) , which composes common paired win-dows that are shared by the window set; • WCG Basic (WCG) , which constructs WCG on top of thewindow set; • WCG with Factor Windows (WCG-FW) , which con-structs WCG containing factor windows; • Baseline (BL) , which directly computes the windows with-out window slicing or sharing.
Evaluation Metrics.
We compare the computation costs of theparticipating techniques on a given window set. One problem isthat the period S used by the cost models of UP and SP are differentfrom the period R in this paper for BL , WCG , and
WCG-FW . Oursolution is to extend both periods to their least common multiple,i.e., S ′ = R ′ = lcm { S , R } , and consider the total computation costof each technique during this extended period. We only report results on window sets generated by
RandomGen , ChainGen , and
StarGen with five windows. The observationsover the results on window sets generated by these generators withten windows are very similar and therefore omitted.In each of the Figures 11 to 15 that present the evaluation results,the x -axis represents the ten randomly-generated window sets inthe corresponding experiment, and the y -axis represents the costsof different participating techniques being compared. Random Window Sets.
Figure 11 presents costs of the windowsets generated by
RandomGen with varying input event rates.Figure 12 further presents results using only tumbling windows(for the “partitioned by” semantics). We have several observationsbased on the results: • Baseline Approach – The baseline approach ( BL ) performs theworst overall, but sometimes is comparable to UP and WCG . • Window Slicing Techniques – Unshared paired windows ( UP )significantly outperforms BL on general windows that are notnecessarily tumbling (note that the y -axis is at logarithmic scalein Figure 11). However, for tumbling windows, it performs thesame as or even worse than BL (Figure 12). On the other hand,shared paired windows ( SP ) can improve over UP by more than10 × , regardless of tumbling or general window sets. • WCG Techniques – The WCG-based technique without usingfactor windows (
WCG ) is not very effective over general win-dow sets (Figure 11), improving over BL in a couple of cases. Itimproves over tumbling window sets (Figure 12), outperform-ing both BL and UP dramatically. By allowing factor windows, WCG-FW further improves over
WCG significantly, regardless entao Wu, Philip A. Bernstein, Alex Raizman, and Christina Pavlopoulou (a) |W | = , η = , General (b) |W | = , η = , Tumbling Figure 13: Costs of window sets by ChainGen. of general or tumbling window sets. In fact,
WCG-FW exhibitsperformance similar to SP , which is presumably the state-of-the-art window slicing technique.The above observations become more stable as we increase the inputevent rate η . In practice, we expect medium to high event ingestionrates in most streaming applications that require distributed streamprocessing. As a result, we focus on cases with η =
100 in ourfollowing experimental results.
Chain/Star Graphs.
Figure 13 presents results on window setsgenerated by
ChainGen . On general window sets, while the per-formance of
WCG is between that of UP and BL , WCG-FW cansignificantly reduce the cost of
WCG by inducing factor windowsand brings the cost back to the level of SP . On tumbling windowsets, WCG performs almost the same as that of
WCG-FW and SP ,implying that factor windows are not necessary in this particularcase. Figure 14 further presents results on window sets generatedby StarGen and we have the same observations.
Random Graphs.
Finally, Figure 15 presents results on windowsets generated by
RandomGraphGen . Again, BL and UP remainthe worst among the evaluated techniques. Meanwhile, WCG-FW is no worse than
WCG , and sometimes can achieve the same per-formance as that of SP . The related work on stream query processing and optimization isoverwhelming (see [26] for a survey). We focus our discussion onoptimization techniques dedicated to window aggregates [15, 31].One prominent line of work in the literature is the class of windowslicing techniques (e.g. [29, 30]). The general idea is to chop the en-tire window into smaller chunks, and then compute the aggregate (a) |W | = , η = , General (b) |W | = , η = , Tumbling Figure 14: Costs of window sets by StarGen. (a) |W | = , η = , General (b) |W | = , η = , Tumbling Figure 15: Costs of window sets by RandomGraphGen. over the whole window by aggregating sub-aggregates over thesmall chunks. Unlike window slicing, we do not proactively chopa window. Instead, we exploit the internal overlapping relation-ships between correlated windows, which are ignored by window ost-based Query Rewriting Techniques for Optimizing Aggregates Over Correlated Windows slicing techniques. As we showed in our experimental evaluation(Section 5), utilizing such overlappings can sometimes outperformwhile in general are comparable to state-of-the-art window slicingtechniques, especially when factor windows are enabled.In addition to either sharing window slices among all windowsor not sharing them at all, as explored by Krishnamurthy et al. [29],Guirguis et al. considered variants that divide windows into groupsand only share window slices within each group [24, 25]. They lever-aged similar cost models as the ones proposed by Krishnamurthyet al. [29], and proposed optimization techniques that search forthe grouping of queries that leads to the minimum overall cost.Like [29], the techniques require changes to the execution run-time and hence entail more implementation complexity than ourapproach, which is done at query rewriting level .There are other techniques in addition to window slicing, suchas Cutty [16] and Reactive Aggregator [38], which were designedfor more general types of windows beyond tumbling/hopping win-dows and more general classes of aggregates, such as user-definedfunctions. Recently, Traub et al. further proposed a framework thatunifies these aforementioned techniques [41]. Again, none of themconsiders the inherent overlap between windows. As future work,it would be interesting to extend our techniques to other types ofwindows and/or aggregate functions. In addition, there has beena flurry of recent work that accelerates window aggregation viabetter utilization of modern hardware, such as Grizzly [23] andLightSaber [39]. This line of work is orthogonal to ours. However,it may be worthwhile to consider combining it with our cost-basedoptimization framework.Cost-based query optimization is the standard practice in batchprocessing systems [37], but is not popular in stream processingsystems. There is little work on cost modeling in the streamingworld [43]. One reason might be the difficulty of defining a singlecost criterion, as streaming systems may need to honor variousperformance metrics simultaneously, such as latency, throughput,and resource utilization [18]. 
Although the application of static cost-based query optimization is limited [9], dynamic query optimization(a.k.a., adaptive query processing) at runtime has been extensivelystudied in the context of streaming (e.g., [8, 12, 20, 21, 32–35, 42]).Our current cost model is static and it is interesting future work toinvestigate how to dynamically adjust cost estimates at runtime bykeeping track of the input event rates.In recent years, a number of distributed streaming systems havebeen built as open-source or proprietary software (e.g., Storm [40],Spark Streaming [7], Flink [14], MillWheel [4], Dataflow [5],Quill [17], etc.). While most of these systems provide users with imperative programming interfaces, the adoption of declarative ,SQL-like query interfaces [6], similar to the one that ASA exposes,has been increasingly popular. For example, both Spark Stream-ing and Flink now support SQL queries on top of data streams.Moving to the declarative interface raises the level of abstractionand enables compile-time query optimization. The optimizationtechniques proposed in this paper can be implemented in either im-perative or declarative systems. We demonstrated the latter for theASA SQL query compiler (Section 3.3), but our algorithms are nottied to the ASA SQL language and can be applied in other streamingsystems that support declarative query languages. We proposed a cost-based optimization framework to optimize theevaluation of aggregate function over multiple correlated windows.It leverages the window coverage graph that we introduced to cap-ture the inherent overlapping relationships between windows. Weintroduced factor windows into the window coverage graph to helpreduce the overall computation overhead. Evaluation results showthat our optimization framework can achieve comparable perfor-mance with respect to state-of-the-art window slicing techniques,especially when factor windows are enabled, without the need forruntime support from stream processing engines.
REFERENCES
[1] [n.d.]. Amazon Kinesis. https://aws.amazon.com/kinesis/.
[2] [n.d.]. Azure Stream Analytics. https://azure.microsoft.com/en-us/services/stream-analytics/.
[3] [n.d.]. Google Cloud Dataflow. https://cloud.google.com/dataflow/.
[4] Tyler Akidau, Alex Balikov, Kaya Bekiroglu, Slava Chernyak, Josh Haberman, Reuven Lax, Sam McVeety, Daniel Mills, Paul Nordstrom, and Sam Whittle. 2013. MillWheel: Fault-Tolerant Stream Processing at Internet Scale. PVLDB 6, 11 (2013), 1033–1044.
[5] Tyler Akidau, Robert Bradshaw, Craig Chambers, Slava Chernyak, Rafael Fernández-Moctezuma, Reuven Lax, Sam McVeety, Daniel Mills, Frances Perry, Eric Schmidt, and Sam Whittle. 2015. The Dataflow Model: A Practical Approach to Balancing Correctness, Latency, and Cost in Massive-Scale, Unbounded, Out-of-Order Data Processing. PVLDB 8, 12 (2015), 1792–1803.
[6] Arvind Arasu, Shivnath Babu, and Jennifer Widom. 2006. The CQL continuous query language: semantic foundations and query execution. VLDB J. 15, 2 (2006), 121–142.
[7] Michael Armbrust, Tathagata Das, Joseph Torres, Burak Yavuz, Shixiong Zhu, Reynold Xin, Ali Ghodsi, Ion Stoica, and Matei Zaharia. 2018. Structured Streaming: A Declarative API for Real-Time Applications in Apache Spark. In SIGMOD. 601–613.
[8] Ron Avnur and Joseph M. Hellerstein. 2000. Eddies: Continuously Adaptive Query Processing. In SIGMOD. 261–272.
[9] Ahmed Ayad and Jeffrey F. Naughton. 2004. Static Optimization of Conjunctive Queries with Sliding Windows Over Infinite Streams. In SIGMOD. 419–430.
[10] Roger S. Barga, Jonathan Goldstein, Mohamed H. Ali, and Mingsheng Hong. 2007. Consistent Streaming Through Time: A Vision for Event Stream Processing. In CIDR. 363–374.
[11] Lawrence Benson, Philipp M. Grulich, Steffen Zeuch, Volker Markl, and Tilmann Rabl. 2020. Disco: Efficient Distributed Window Aggregation. In EDBT. 423–426.
[12] Philip A. Bernstein, Todd Porter, Rahul Potharaju, Alejandro Z. Tomsic, Shivaram Venkataraman, and Wentao Wu. 2019. Serverless Event-Stream Processing over Virtual Actors. In CIDR.
[13] Jaroslaw Byrka, Fabrizio Grandoni, Thomas Rothvoß, and Laura Sanità. 2010. An improved LP-based approximation for Steiner tree. In STOC. 583–592.
[14] Paris Carbone, Asterios Katsifodimos, Stephan Ewen, Volker Markl, Seif Haridi, and Kostas Tzoumas. 2015. Apache Flink™: Stream and Batch Processing in a Single Engine. IEEE Data Eng. Bull. 38, 4 (2015), 28–38.
[15] Paris Carbone, Asterios Katsifodimos, and Seif Haridi. 2019. Stream Window Aggregation Semantics and Optimization. In Encyclopedia of Big Data Technologies, Sherif Sakr and Albert Y. Zomaya (Eds.). Springer.
[16] Paris Carbone, Jonas Traub, Asterios Katsifodimos, Seif Haridi, and Volker Markl. 2016. Cutty: Aggregate Sharing for User-Defined Windows. In CIKM. 1201–1210.
[17] Badrish Chandramouli, Raul Castro Fernandez, Jonathan Goldstein, Ahmed Eldawy, and Abdul Quamar. 2016. Quill: Efficient, Transferable, and Rich Analytics at Scale. PVLDB 9, 14 (2016), 1623–1634.
[18] Badrish Chandramouli, Jonathan Goldstein, Roger S. Barga, Mirek Riedewald, and Ivo Santos. 2011. Accurate latency estimation in a distributed event processing system. In ICDE. 255–266.
[19] Badrish Chandramouli, Jonathan Goldstein, Mike Barnett, Robert DeLine, John C. Platt, James F. Terwilliger, and John Wernsing. 2014. Trill: A High-Performance Incremental Query Processor for Diverse Analytics. PVLDB 8, 4 (2014), 401–412.
[20] Amol Deshpande and Joseph M. Hellerstein. 2004. Lifting the Burden of History from Adaptive Query Processing. In VLDB. 948–959.
[21] Avrilia Floratou, Ashvin Agrawal, Bill Graham, Sriram Rao, and Karthik Ramasamy. 2017. Dhalion: Self-Regulating Stream Processing in Heron. PVLDB 10, 12 (2017), 1825–1836.
[22] Jim Gray, Surajit Chaudhuri, Adam Bosworth, Andrew Layman, Don Reichart, Murali Venkatrao, Frank Pellow, and Hamid Pirahesh. 1997. Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals. Data Min. Knowl. Discov. 1, 1 (1997), 29–53.
[23] Philipp M. Grulich, Sebastian Breß, Steffen Zeuch, Jonas Traub, Janis von Bleichert, Zongxiong Chen, Tilmann Rabl, and Volker Markl. [n.d.]. Grizzly: Efficient Stream Processing Through Adaptive Query Compilation. In SIGMOD. 2487–2503.
[24] Shenoda Guirguis, Mohamed A. Sharaf, Panos K. Chrysanthis, and Alexandros Labrinidis. 2011. Optimized processing of multiple aggregate continuous queries. In CIKM. 1515–1524.
[25] Shenoda Guirguis, Mohamed A. Sharaf, Panos K. Chrysanthis, and Alexandros Labrinidis. 2012. Three-Level Processing of Multiple Aggregate Continuous Queries. In ICDE. 929–940.
[26] Martin Hirzel, Robert Soulé, Scott Schneider, Bugra Gedik, and Robert Grimm. 2013. A catalog of stream processing optimizations. ACM Comput. Surv. 46, 4 (2013), 46:1–46:34.
[27] Jeyhun Karimov, Tilmann Rabl, Asterios Katsifodimos, Roman Samarev, Henri Heiskanen, and Volker Markl. 2018. Benchmarking Distributed Stream Data Processing Systems. In ICDE. 1507–1518.
[28] Richard M. Karp. 1972. Reducibility Among Combinatorial Problems. In Complexity of Computer Computations. 85–103.
[29] Sailesh Krishnamurthy, Chung Wu, and Michael J. Franklin. 2006. On-the-fly sharing for streamed aggregation. In SIGMOD. 623–634.
[30] Jin Li, David Maier, Kristin Tufte, Vassilis Papadimos, and Peter A. Tucker. 2005. No pane, no gain: efficient evaluation of sliding-window aggregates over data streams. SIGMOD Record 34, 1 (2005), 39–44.
[31] Jin Li, David Maier, Kristin Tufte, Vassilis Papadimos, and Peter A. Tucker. 2005. Semantics and Evaluation Techniques for Window Aggregates in Data Streams. In SIGMOD. 311–322.
[32] Luo Mai, Kai Zeng, Rahul Potharaju, Le Xu, Steve Suh, Shivaram Venkataraman, Paolo Costa, Terry Kim, Saravanam Muthukrishnan, Vamsi Kuppa, Sudheer Dhulipalla, and Sriram Rao. 2018. Chi: A Scalable and Programmable Control Plane for Distributed Stream Processing Systems. PVLDB 11, 10 (2018), 1303–1316.
[33] Rimma V. Nehme, Elke A. Rundensteiner, and Elisa Bertino. 2009. Self-tuning query mesh for adaptive multi-route query processing. In EDBT. 803–814.
[34] Rimma V. Nehme, Karen Works, Chuan Lei, Elke A. Rundensteiner, and Elisa Bertino. 2013. Multi-route query processing and optimization. J. Comput. Syst. Sci. 79, 3 (2013), 312–329.
[35] Vijayshankar Raman, Amol Deshpande, and Joseph M. Hellerstein. 2003. Using State Modules for Adaptive Query Processing. In ICDE. 353–364.
[36] Gabriel Robins and Alexander Zelikovsky. 2000. Improved Steiner tree approximation in graphs. In SODA. 770–779.
[37] Patricia G. Selinger, Morton M. Astrahan, Donald D. Chamberlin, Raymond A. Lorie, and Thomas G. Price. 1979. Access Path Selection in a Relational Database Management System. In SIGMOD. 23–34.
[38] Kanat Tangwongsan, Martin Hirzel, Scott Schneider, and Kun-Lung Wu. 2015. General Incremental Sliding-Window Aggregation. PVLDB 8, 7 (2015), 702–713.
[39] Georgios Theodorakis, Alexandros Koliousis, Peter R. Pietzuch, and Holger Pirk. [n.d.]. LightSaber: Efficient Window Aggregation on Multi-core Processors. In SIGMOD. 2505–2521.
[40] Ankit Toshniwal, Siddarth Taneja, Amit Shukla, Karthikeyan Ramasamy, Jignesh M. Patel, Sanjeev Kulkarni, Jason Jackson, Krishna Gade, Maosong Fu, Jake Donham, Nikunj Bhagat, Sailesh Mittal, and Dmitriy V. Ryaboy. 2014. Storm@twitter. In SIGMOD. 147–156.
[41] Jonas Traub, Philipp M. Grulich, Alejandro Rodriguez Cuellar, Sebastian Breß, Asterios Katsifodimos, Tilmann Rabl, and Volker Markl. 2019. Efficient Window Aggregation with General Stream Slicing. In EDBT. 97–108.
[42] Shivaram Venkataraman, Aurojit Panda, Kay Ousterhout, Michael Armbrust, Ali Ghodsi, Michael J. Franklin, Benjamin Recht, and Ion Stoica. 2017. Drizzle: Fast and Adaptable Stream Processing at Scale. In SOSP. 374–389.
[43] Stratis Viglas and Jeffrey F. Naughton. 2002. Rate-based query optimization for streaming information sources. In SIGMOD. 37–48.
[44] Matei Zaharia, Tathagata Das, Haoyuan Li, Timothy Hunter, Scott Shenker, and Ion Stoica. 2013. Discretized streams: fault-tolerant streaming computation at scale. In SOSP. 423–438.
A PROOFS
A.1 Proof of Theorem 1
Proof. Consider an arbitrary interval $I = [a, b) \in W_2$. By the interval representation of $W_2$, we have $a = m \cdot s_2$ and $b = m \cdot s_2 + r_2$ for some integer $m \ge 0$.

(1) The "if" part ($\Rightarrow$): Since $s_2$ is a multiple of $s_1$, we have $s_2 = k \cdot s_1$ for some integer $k \ge 1$. As a result, $m \cdot s_2 = m \cdot k \cdot s_1 = (m \cdot k) \cdot s_1$. Similarly, since $\delta_r = r_2 - r_1$ is a multiple of $s_1$, $r_2 - r_1 = k' \cdot s_1$ for some integer $k' \ge 1$. As a result,
\[ m \cdot s_2 + r_2 = (m \cdot k) \cdot s_1 + k' \cdot s_1 + r_1 = (m \cdot k + k') \cdot s_1 + r_1. \]
Set $m_1 = m \cdot k$ and $m_1' = m \cdot k + k'$. Now consider two intervals $I_a = [a, x) = [m_1 \cdot s_1, m_1 \cdot s_1 + r_1)$ and $I_b = [y, b) = [m_1' \cdot s_1, m_1' \cdot s_1 + r_1)$ that belong to $W_1$. Clearly, we have $m_1 \cdot s_1 = m \cdot s_2 = a$ and $m_1' \cdot s_1 + r_1 = m \cdot s_2 + r_2 = b$. Moreover, since $m_1' > m_1$, we have $x = m_1 \cdot s_1 + r_1 < b$ and $y = m_1' \cdot s_1 > a$. Therefore, $W_2$ is covered by $W_1$, by Definition 1.

(2) The "only if" part ($\Leftarrow$): Since $W_2$ is covered by $W_1$, by Definition 1 there exist two intervals $I_a = [a, x)$ and $I_b = [y, b)$ in $W_1$ such that $x \le b$ and $y \ge a$. As a result, there is some $m_1 \ge 0$ such that $m_1 \cdot s_1 = a = m \cdot s_2$. That is, $m_1 = m \cdot (s_2 / s_1)$. Since both $m_1$ and $m$ are integers, $s_2 / s_1$ is also an integer. As a result, $s_2$ must be a multiple of $s_1$. On the other hand, similarly there is some $m_1' \ge m_1$ such that $m_1' \cdot s_1 + r_1 = b = m \cdot s_2 + r_2$. We then have $m_1' \cdot s_1 + r_1 = m_1 \cdot s_1 + r_2$, which yields $m_1' = m_1 + (r_2 - r_1)/s_1$. Since both $m_1'$ and $m_1$ are integers, $(r_2 - r_1)/s_1$ must be an integer. Hence, $\delta_r = r_2 - r_1$ is a multiple of $s_1$.

This completes the proof of the theorem. □
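As a concrete illustration (the helper below is hypothetical and not part of the paper's implementation), the Theorem 1 criterion can be checked directly from the (range, slide) parameters, here given in minutes:

def covers(r1, s1, r2, s2):
    # Theorem 1: W2 = (r2, s2) is covered by W1 = (r1, s1) iff
    # s2 and (r2 - r1) are both multiples of s1.
    return s2 % s1 == 0 and (r2 - r1) % s1 == 0

# A 30-minute tumbling window is covered by a 10-minute tumbling window.
assert covers(10, 10, 30, 30)
# A 30-minute window hopping every 5 minutes is not covered by it,
# because its slide (5) is not a multiple of 10.
assert not covers(10, 10, 30, 5)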
A.2 Proof of Theorem 2
Proof. We prove the three properties one by one.

(1) Reflexivity: Clearly, by Definition 1 a window $W$ is covered by itself.

(2) Antisymmetry: Suppose that $W_1 \le W_2$ and $W_2 \le W_1$. Consider an arbitrary interval $[a, b)$ contained by $W_1$. Since $W_1 \le W_2$, there exist two intervals $I_x = [a, x)$ and $I_y = [y, b)$ in $W_2$. On the other hand, since $W_2 \le W_1$, for $I_x$ there exist intervals $I_{x'} = [a, x')$ and $I_{x''} = [x'', x)$ in $W_1$. Since no two intervals in a window start from the same time point but end at different time points, we conclude that $x' = b$. Since $x' \le x \le b$ by Definition 1, we have $x = x' = b$, i.e., $[a, b)$ is also an interval of $W_2$. Using similar arguments we can show that $y = a$ and that every interval of $W_2$ is an interval of $W_1$. As a result, we have proved that $W_1 = W_2$.

(3) Transitivity: Suppose that $W_1 \le W_2$ and $W_2 \le W_3$. Again, consider an arbitrary interval $[a, b)$ in $W_1$. Since $W_1 \le W_2$, there exist two intervals $I_x = [a, x)$ and $I_y = [y, b)$ in $W_2$. Moreover, since $W_2 \le W_3$, there exist two intervals $I_{x'} = [a, x')$ and $I_{x''} = [x'', x)$ in $W_3$, and there also exist two intervals $I_{y'} = [y, y')$ and $I_{y''} = [y'', b)$ in $W_3$. Now consider $I_{x'}$ and $I_{y''}$. By Definition 1, we have $x' \le x \le b$ and $y'' \ge y \ge a$. Since $[a, b)$ is an arbitrary interval in $W_1$, it follows that $W_1 \le W_3$.

This completes the proof of the theorem. □
A.3 Proof of Theorem 3
Proof. If we take the union of the intervals in $\mathcal{I}_{a,b}$, it is easy to see that $I = \bigcup_{J \in \mathcal{I}_{a,b}} J$. By Definition 1, we can further enumerate the intervals in $\mathcal{I}_{a,b}$ as $J_1 = [x_1, y_1), \ldots, J_n = [x_n, y_n)$ such that $x_1 = a$, $y_n = b$, and $x_1 < \cdots < x_n$, where $n = |\mathcal{I}_{a,b}|$. Therefore,
\[ I = J_1 \cup (J_2 - J_1) \cup \cdots \cup (J_n - J_{n-1}). \]
Since the intervals $J_1, J_2 - J_1, \ldots, J_n - J_{n-1}$ are mutually exclusive, it follows that
\[ |I| = |J_1| + |J_2 - J_1| + \cdots + |J_n - J_{n-1}|. \]
We have $|I| = r_2$, $|J_1| = r_1$, and $|J_k - J_{k-1}| = s_1$ for $2 \le k \le n$. As a result, $r_2 = r_1 + (n - 1) \cdot s_1$, which yields
\[ M(W_2, W_1) = n = 1 + (r_2 - r_1)/s_1. \]
This completes the proof of the theorem. □
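For example (an illustrative computation with made-up parameters, not taken from the paper), a 40-minute tumbling window needs 1 + (40 − 10)/10 = 4 intervals of a 10-minute tumbling window, but only 1 + (40 − 30)/10 = 2 intervals of a 30-minute window that slides every 10 minutes. A hypothetical helper makes the formula explicit:

def covering_set_size(r1, s1, r2):
    # Theorem 3: size of the covering set in W1 = (r1, s1) for one interval
    # of a window with range r2 that W1 covers.
    assert (r2 - r1) % s1 == 0
    return 1 + (r2 - r1) // s1

assert covering_set_size(10, 10, 40) == 4
assert covering_set_size(30, 10, 40) == 2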
A.4 Proof of Theorem 4
Proof. We prove each direction separately.

(a) The "if" part ($\Rightarrow$): Suppose that conditions (1) to (3) hold. By (2) and (3), we know that $r_2 - r_1$ must be a multiple of $s_1$ as well. Combining this with (1), $W_2$ is covered by $W_1$ according to Theorem 1. Now consider an arbitrary interval $I$ in $W_2$. Let the covering set of $I$ in $W_1$ be $\mathcal{I}$. We next show that $\mathcal{I}$ is disjoint. By (2) and (3) we know that $r_2$ is a multiple of $r_1$. As a result, $r_2 = k \cdot r_1$ where $k$ is an integer. To show that $\mathcal{I}$ is disjoint we only need to show that $|\mathcal{I}| = k$ (recall Figure 4(a)). We have
\[ |\mathcal{I}| = 1 + (r_2 - r_1)/s_1 \quad \text{[by Theorem 3]} \]
\[ = 1 + (k \cdot r_1 - r_1)/s_1 \quad \text{[by Condition (2)]} \]
\[ = 1 + (k - 1) \quad \text{[by Condition (3)]} \]
\[ = k. \]

(b) The "only if" part ($\Leftarrow$): Suppose that $W_2$ is partitioned by $W_1$. By Theorem 1, condition (1) holds. Again, consider an arbitrary interval $I$ in $W_2$ and let its covering set in $W_1$ be $\mathcal{I}$. We know that $\mathcal{I}$ is disjoint, which implies condition (3), i.e., $r_1 = s_1$, as well as that $r_2$ must be a multiple of $r_1$. Therefore, $r_2$ must also be a multiple of $s_1$ and condition (2) holds.

This completes the proof of the theorem. □
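For instance (again an illustrative check with made-up parameters), a 10-minute tumbling window partitions a 30-minute tumbling window, whereas a 10-minute window sliding every 5 minutes covers the same window by Theorem 1 but does not partition it:

def partitions(r1, s1, r2, s2):
    # Theorem 4: W2 = (r2, s2) is partitioned by W1 = (r1, s1) iff s2 and r2
    # are multiples of s1 and W1 is tumbling (r1 == s1).
    return s2 % s1 == 0 and r2 % s1 == 0 and r1 == s1

assert partitions(10, 10, 30, 30)      # disjoint covering set of size 3
assert not partitions(10, 5, 30, 30)   # covered, but the covering set overlaps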
A.5 Proof of Theorem 6
Proof. We only prove that MIN is distributive over overlapping partitions, as the proof for MAX is very similar. We set both $f$ and $g$ in the definition of distributive aggregate functions to MIN. It is easy to see that, for two sets $S_1$ and $S_2$ satisfying $S_1 \subseteq S_2$, $\mathrm{MIN}(S_2) \le \mathrm{MIN}(S_1)$. Moreover, for any set $S$, $\mathrm{MIN}(S) \in S$ and thus $\{\mathrm{MIN}(S)\} \subseteq S$. Therefore,
\[ S = \{\mathrm{MIN}(T_1), \ldots, \mathrm{MIN}(T_n)\} \subseteq T_1 \cup \cdots \cup T_n = T, \]
since $\{\mathrm{MIN}(T_1)\} \subseteq T_1, \ldots, \{\mathrm{MIN}(T_n)\} \subseteq T_n$. As a result,
\[ \mathrm{MIN}(T) \le \mathrm{MIN}(S) = \mathrm{MIN}(\{\mathrm{MIN}(T_1), \ldots, \mathrm{MIN}(T_n)\}). \]

We now prove that $\mathrm{MIN}(S) \le \mathrm{MIN}(T)$. To see this, let $S_1 = T_1$, $S_2 = T_2 - T_1$, $S_3 = T_3 - (S_1 \cup S_2)$, ..., $S_n = T_n - (S_1 \cup \cdots \cup S_{n-1})$. (Here we treat each element in $T$ as a distinct element, even if some of them may have the same data value.) We have $T = S_1 \cup \cdots \cup S_n$, and $S_i \cap S_j = \emptyset$ for all $1 \le i \ne j \le n$. Therefore, $\mathrm{MIN}(T) = \mathrm{MIN}(S_1 \cup \cdots \cup S_n)$. Moreover, there exists some $j$ such that $\mathrm{MIN}(S_j) = \mathrm{MIN}(T)$. Since $S_j \subseteq T_j$, $\mathrm{MIN}(S_j) \ge \mathrm{MIN}(T_j)$. As a result,
\[ \mathrm{MIN}(T) = \mathrm{MIN}(\{\mathrm{MIN}(S_1), \ldots, \mathrm{MIN}(S_n)\}) \ge \mathrm{MIN}(\{\mathrm{MIN}(T_1), \ldots, \mathrm{MIN}(T_n)\}) = \mathrm{MIN}(S). \]

Since we have proved both $\mathrm{MIN}(S) \le \mathrm{MIN}(T)$ and $\mathrm{MIN}(T) \le \mathrm{MIN}(S)$, it must hold that $\mathrm{MIN}(S) = \mathrm{MIN}(T)$. □
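A small numeric illustration of this property (hypothetical values, not from the paper): the minimum over a window equals the minimum of the minima of overlapping sub-windows.

T1 = [7, 3, 9]
T2 = [3, 9, 5]        # overlaps with T1
T3 = [5, 8, 2, 6]     # overlaps with T2
T = set(T1) | set(T2) | set(T3)

# MIN is distributive over the overlapping partition {T1, T2, T3} of T.
assert min(min(T1), min(T2), min(T3)) == min(T) == 2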
A.6 Proof of Theorem 8
Proof. Since both $W_f$ and $W$ in Figure 9 are now tumbling windows, $k_f = k_W = 1$. Equation 3 then yields
\[ \sum_{j=1}^{K} \frac{n_j}{n_f}\Big(\frac{r_j}{s_f} - \frac{r_j}{s_W}\Big) + \frac{r_f}{s_W} \le 0. \]
Since $r_f = s_f$ and $r_W = s_W$, it follows that
\[ \sum_{j=1}^{K} \frac{n_j}{n_f}\Big(\rho_j - \frac{r_j}{r_W}\Big) + \frac{r_f}{r_W} \le 0. \]
Since $r_f = r_j/\rho_j$ by definition, we have $r_j = \rho_j r_f$ and thus
\[ \sum_{j=1}^{K} \frac{n_j \rho_j}{n_f}\Big(1 - \frac{r_f}{r_W}\Big) + \frac{r_f}{r_W} \le 0. \quad (6) \]
Moreover, by definition of $n_f$ (Equation 1) we have
\[ n_f = (m_f - 1)k_f + 1 = m_f = \frac{R}{r_f} = \frac{R \rho_j}{r_j} = m_j \rho_j. \]
Substituting into Equation 6, it follows that
\[ \Big(1 - \frac{r_f}{r_W}\Big) \cdot \lambda + \frac{r_f}{r_W} \le 0, \quad (7) \]
where $\lambda$ has been defined in Equation 4. As a result, we have
\[ \frac{r_f}{r_W} \ge \frac{\lambda}{\lambda - 1}. \quad (8) \]
Since $n_j = (m_j - 1)k_j + 1 \ge m_j$, by Equation 4 we have $\lambda \ge K$. We distinguish two cases: $K \ge 2$ and $K = 1$.

The Case of $K \ge 2$. When $K \ge 2$,
\[ \frac{\lambda}{\lambda - 1} \le \frac{K}{K - 1} \le 2. \]
Since $r_f / r_W \ge 2$, Equation 8 holds, which implies $c \le c'$. Note that the equality $c = c'$ only holds when $r_f = 2 r_W$ and $\lambda = K = 2$, with $n_j = m_j$ for $j = 1, 2$. In this case, both downstream windows of $W$ (and thus $W_f$) are tumbling, and $W_f$ exactly doubles the range of $W$, which is a very special case.

The Case of $K = 1$. When $K = 1$, $\lambda = n_1 / m_1$. We distinguish two situations:
• If $k_1 = 1$, which means that the (unique) downstream window is tumbling, then $n_1 = m_1$ and thus $\lambda = 1$. Equation 7 then implies that $1 \le 0$, which is impossible. As a result, $c \le c'$ does not hold.
• If $k_1 > 1$, then $\lambda > 1$ and $m_1 > 1$, since if $m_1 = 1$ then $n_1 = (m_1 - 1)k_1 + 1 = 1$ and thus $\lambda = 1$, a contradiction. Substituting $\lambda = n_1 / m_1$, we obtain
\[ \frac{\lambda}{\lambda - 1} = 1 + \frac{m_1}{n_1 - m_1} = 1 + \frac{m_1}{(m_1 - 1)(k_1 - 1)} = 1 + \frac{1}{k_1 - 1} + \frac{1}{(m_1 - 1)(k_1 - 1)}. \]
As a result, when $k_1 \ge 3$ and $m_1 \ge 3$,
\[ \frac{\lambda}{\lambda - 1} \le 1 + \frac{1}{2} + \frac{1}{4} < 2, \]
and thus Equation 8 holds without equality as $r_f \ge 2 r_W$, which implies $c < c'$. For the other two special cases, where one of $k_1$ and $m_1$ is 2 and the other is 3, we have to compare the LHS and RHS to determine whether Equation 8 holds.
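To see the threshold in Equation 8 with concrete numbers (an illustrative calculation with made-up window parameters, not an experiment from the paper, assuming the cost with the factor window is $\sum_j n_j M(W_j, W_f) + n_f M(W_f, W)$ and the cost without it is $\sum_j n_j M(W_j, W)$): take a 5-minute tumbling base window $W$, a candidate 15-minute tumbling factor window $W_f$, $K = 2$ downstream tumbling windows of 30 and 45 minutes, and a horizon of $R = 90$ minutes.

def M(r_cov, s_cov, r_big):
    # Theorem 3: covering-set size for one interval of range r_big.
    return 1 + (r_big - r_cov) // s_cov

R, r_W, r_f, downstream = 90, 5, 15, [30, 45]
n = {r: R // r for r in downstream}     # n_j = m_j since all windows are tumbling
n_f = R // r_f

cost_without = sum(n[r] * M(r_W, r_W, r) for r in downstream)
cost_with = sum(n[r] * M(r_f, r_f, r) for r in downstream) + n_f * M(r_W, r_W, r_f)

lam = len(downstream)                   # lambda = sum(n_j / m_j) = K = 2 here
assert r_f / r_W >= lam / (lam - 1)     # Equation 8: 3 >= 2
assert cost_with < cost_without         # 30 < 36, so the factor window pays off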
A.7 Proof of Theorem 9
Let $d = c_f - c'_f$. It then follows that
\[ d = \sum_{j=1}^{K} n_j \big(M(W_j, W_f) - M(W_j, W'_f)\big) + \Delta \quad (9) \]
\[ = \sum_{j=1}^{K} n_j \Big(\frac{r_j - r_f}{s_f} - \frac{r_j - r'_f}{s'_f}\Big) + \Delta = \sum_{j=1}^{K} n_j \Big(\frac{r_j}{s_f} - k_f - \frac{r_j}{s'_f} + k'_f\Big) + \Delta, \]
where
\[ \Delta = n_f \cdot M(W_f, W) - n'_f \cdot M(W'_f, W) = n_f \Big(1 + \frac{r_f - r_W}{s_W}\Big) - n'_f \Big(1 + \frac{r'_f - r_W}{s_W}\Big) = n_f \Big(1 + \frac{r_f}{s_W} - k_W\Big) - n'_f \Big(1 + \frac{r'_f}{s_W} - k_W\Big). \]
Clearly, $W_f$ is more beneficial if $d < 0$.

Proof. Since $W_f$, $W'_f$, and $W$ are all tumbling windows, $k_f = k'_f = k_W = 1$. Substituting into Equation 9 and using the facts $r_f = s_f$, $r'_f = s'_f$, and $r_W = s_W$ yields
\[ c_f - c'_f = \sum_{j=1}^{K} n_j \Big(\frac{r_j}{r_f} - \frac{r_j}{r'_f}\Big) + n_f \cdot \frac{r_f}{r_W} - n'_f \cdot \frac{r'_f}{r_W} = n_f \Big(\sum_{j=1}^{K} \frac{n_j}{n_f}\Big(\frac{r_j}{r_f} - \frac{r_j}{r'_f}\Big) + \frac{r_f}{r_W} - \frac{n'_f}{n_f} \cdot \frac{r'_f}{r_W}\Big). \]
Again we consider when $c_f \le c'_f$ holds, or equivalently,
\[ \sum_{j=1}^{K} \frac{n_j}{n_f}\Big(\frac{r_j}{r_f} - \frac{r_j}{r'_f}\Big) + \frac{r_f}{r_W} - \frac{n'_f}{n_f} \cdot \frac{r'_f}{r_W} \le 0. \]
Similarly, define
\[ \rho_j = \frac{r_j}{r_f}, \quad \rho'_j = \frac{r_j}{r'_f}, \quad \forall\, 1 \le j \le K. \]
Since $W_f$ is tumbling,
\[ n_f = m_f = \frac{R}{r_f} = \frac{m_j r_j}{r_f} = m_j \rho_j. \]
It therefore follows that
\[ \sum_{j=1}^{K} \frac{n_j}{m_j \rho_j}(\rho_j - \rho'_j) + \frac{r_f}{r_W} - \frac{n'_f}{n_f} \cdot \frac{r'_f}{r_W} \le 0. \]
Noting that
\[ \frac{\rho'_j}{\rho_j} = \frac{r_j / r'_f}{r_j / r_f} = \frac{r_f}{r'_f} \]
and making some rearrangement of the terms yields
\[ \Big(1 - \frac{r_f}{r'_f}\Big) \sum_{j=1}^{K} \frac{n_j}{m_j} + \frac{r'_f}{r_W}\Big(\frac{r_f}{r'_f} - \frac{n'_f}{n_f}\Big) \le 0. \]
As before, define $\lambda = \sum_{j=1}^{K} n_j / m_j$. It then follows that
\[ \frac{r_f}{r'_f} \ge \frac{\lambda - \frac{r'_f}{r_W} \cdot \frac{n'_f}{n_f}}{\lambda - \frac{r'_f}{r_W}}. \]
Moreover, since both $W_f$ and $W'_f$ are tumbling windows, we have $n_f = m_f$ and $n'_f = m'_f$. Therefore,
\[ \frac{r'_f}{r_W} \cdot \frac{n'_f}{n_f} = \frac{r'_f}{r_W} \cdot \frac{m'_f}{m_f} = \frac{R}{r_W m_f} = \frac{r_f}{r_W}, \]
which yields
\[ \frac{r_f}{r'_f} \ge \frac{\lambda - \frac{r_f}{r_W}}{\lambda - \frac{r'_f}{r_W}}. \]
This completes the proof of the theorem. □
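As a final numeric illustration (made-up parameters, under the same assumed cost model as above): with a 5-minute tumbling base window, downstream tumbling windows of 40, 60, and 80 minutes, and $R = 240$ minutes, a 20-minute factor window beats a 10-minute one, in agreement with the criterion just derived.

def M(r_cov, r_big):
    # covering-set size when the covering window is tumbling (slide = range)
    return 1 + (r_big - r_cov) // r_cov

R, r_W, downstream = 240, 5, [40, 60, 80]

def total_cost(r_f):
    n = {r: R // r for r in downstream}              # n_j = m_j (tumbling)
    return sum(n[r] * M(r_f, r) for r in downstream) + (R // r_f) * M(r_W, r_f)

c_f, c_f_prime = total_cost(20), total_cost(10)
lam = len(downstream)                                # lambda = K = 3 here
# Theorem 9 criterion: r_f / r'_f >= (lambda - r_f/r_W) / (lambda - r'_f/r_W)
assert 20 / 10 >= (lam - 20 / r_W) / (lam - 10 / r_W)
assert c_f <= c_f_prime                              # 84 <= 120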