Scalable and Reliable Multi-Dimensional Aggregation of Sensor Data Streams
Sören Henning∗, Wilhelm Hasselbring†
Software Engineering Group, Kiel University, 24098 Kiel
∗[email protected], †[email protected]

Abstract—Ever-increasing amounts of data and requirements to process them in real time lead to more and more analytics platforms and software systems being designed according to the concept of stream processing. A common area of application is the processing of continuous data streams from sensors, for example, IoT devices or performance monitoring tools. In addition to analyzing pure sensor data, analyses of data for groups of sensors often need to be performed as well. Therefore, data streams of the individual sensors have to be continuously aggregated to a data stream for a group. Motivated by a real-world application scenario, we propose that such a stream aggregation approach has to allow for aggregating sensors in hierarchical groups, support multiple such hierarchies in parallel, provide reconfiguration at runtime, and preserve the scalability and reliability qualities induced by applying stream processing techniques. We propose a stream processing architecture fulfilling these requirements, which can be integrated into existing big data architectures. We present a pilot implementation of such an extended architecture and show how it is used in industry. Furthermore, in experimental evaluations we show that our solution scales linearly with the amount of sensors and provides adequate reliability in the case of faults.
I. INTRODUCTION
Stream processing [1], [2] has evolved as a paradigm to process and analyze continuous streams of data, for example, coming from IoT sensors. The rapid development of stream processing engines [3] over the last years has paved the way for applications that process data exclusively online, i.e., as soon as it is recorded. Whereas a couple of years ago Lambda architectures were the de-facto standard for analytics platforms, currently more and more platforms follow the Kappa architecture pattern, where data is exclusively processed online [4]. Further, entire software system architectures follow patterns such as asynchronously communicating microservices [5] and event sourcing [6], which require data to be available as continuous streams instead of being actively polled from databases.
When considering continuous streams of measurement data, for example, from physical IoT sensors or software performance metrics, often an aggregation of multiple such streams is required. Whereas this is a well-known task in the traditional approach of first writing all measurements to a (relational) database and then querying this database, performing such an aggregation on continuous data streams poses difficulties. This is in particular true when requirements for scalability and reliability have to be considered.
In this paper, we contribute to the seminal work on stream processing by presenting and evaluating an approach to aggregate data of multiple streams in real time. The remaining paper is structured as follows: Section II motivates the demand for our aggregation approach by an example for IoT sensor data streams. Section III derives essential requirements for an aggregation approach. Section IV presents our Titan stream processing architecture for aggregating data streams. Section V shows how we implement this architecture in an industry-applied IoT monitoring platform. Section VI evaluates our proposed architecture in terms of scalability and reliability. Section VII discusses limitations of our approach and possible extensions to overcome them. Finally, Section VIII discusses related work and Section IX concludes this paper.

II. MOTIVATING EXAMPLE
Operators of industrial production environments have high interest in getting detailed insights into the resource usage of machines and production processes. Those insights may reveal optimization potential, provide reporting for stakeholders, and can be used for predictive maintenance. Today's industrial production environments operate a variety of network-compatible measuring instruments. Such instruments (sensors) continuously measure, inter alia, the resource usage of individual machines, for example, their electrical power consumption. They may publish these metrics via a messaging system, allowing real-time analytics systems to collect, process, store, and visualize those data.
In particular, but not exclusively in the case of electrical power consumption, it is not only producing machinery that uses resources but also other company areas such as IT infrastructure, employee offices, or building technology. This leads to the situation that the number of resource-consuming devices is often immense, which makes it difficult to assess. Therefore, metrics for groups of machines are often required in addition to consumption metrics of the individual machines. Consequently, the data streams of the individual sensors have to be continuously aggregated. Referring to Fig. 1, which exemplarily shows a production environment comprising various machines and devices, operators may require to answer questions such as: What is the resource usage of a certain machine type (e.g., the total power consumption of all turning shops)? What is the resource usage by business unit (e.g., the total power consumption of producing machinery)? What is the resource usage of physically collocated machines (e.g., the total power consumption per building or shop floor)? What is the overall company-wide resource usage? Further, for machines featuring multiple independent power supplies (e.g., for redundancy), already obtaining consumption data for single machines requires aggregating the data streams of their individual power supplies.

Fig. 1. Schematic illustration of a manufacturing company operating two buildings and a wide range of power-consuming machinery and infrastructure. Operators may be, for instance, interested in the total power consumption of all turning shops (red), of machinery directly required for production processes (blue), of a certain building (green), or of the entire company (yellow).

III. REQUIREMENTS FOR STREAM AGGREGATION
Even though we motivated the need for real-time aggregation of data streams by resource data recorded by IoT devices, similar kinds of data aggregations are required by several other types of continuous sensor data streams. Continuing our motivating example, we derive the following requirements for a data stream aggregation architecture:
1) Multi-Layer Aggregation:
Measurements of sensors are aggregated to groups of sensors. These groups can again be aggregated to larger groups and so forth. In the previous example, these could first be groups of individual machines of the same type, then groups of machines fulfilling the same function, then all machines used by the same production step and, further, all machines in the entire production environment.
2) Multi-Hierarchy Aggregation:
In addition to a single hierarchy as described above, it is likely that there is a demand for supporting multiple such hierarchies. Referring to the previous example, besides a hierarchy based on the purpose of machines, one may also need a hierarchy which represents the physical location. For example, in a first step machines in the same shop floor are grouped and then all shop floors in a certain building are grouped.
3) Hierarchy Reconfiguration at Runtime:
Stream processing applications are often characterized by demands for high availability. Therefore, we search for an approach that allows modifying or extending the previously described hierarchies at runtime to prevent reconfigurations from causing downtimes.
4) Preserving Scalability and Reliability:
Stream processing is usually used for large volumes of data where the load has to be handled by multiple CPU cores or even multiple computing nodes. Therefore, an approach that performs aggregations on these streams has to preserve scalability and reliability properties in order not to become the overall architecture's bottleneck.

IV. OUR TITAN APPROACH
In this section, we present a generic approach to hierarchically aggregate streams of data. We start with a brief summary of the dual streaming model, which provides the foundation of our approach. Based on this model, we then introduce the actual approach in terms of a topology architecture.
A. The Dual Streaming Model
The dual streaming model [7] is a model to define the semantics of a stream processing architecture. It adopts the notion of data streams and streaming operators from other established stream processing models [8], [9], [10]. A data stream is an append-only sequence of immutable records, where records are key-value pairs augmented by a timestamp. Key-value pairs allow for data parallelization as records with different keys can be processed in parallel. This is the fundamental idea of building highly scalable stream processing applications.
Streaming operators are functions applied to each record of an input stream, whose results are appended to an output stream. The number of output records may be zero, one, or more than one, depending on the type of operator. Usually, operators are distinguished as being stateless or stateful. Stateless operators produce an output solely based on the currently processed input record, whereas stateful operators may also take previous input records and computations into account. Successively connecting operator output streams with other operators allows to define complex stream processing topology architectures, for example, to build big data analytics applications.
The dual streaming model extends these models by considering the result of streaming operators as successive updates to a table. These updates may be materialized into a versioned table or represented as a stream of insert, update, and delete events, inducing a duality of streams and tables.
Sax et al. [7] present a reference implementation of the dual streaming model called Kafka Streams, a stream processing framework built on top of the distributed messaging system Apache Kafka [11]. As in some cases the dual streaming model abstracts away too many details to comprehensibly explain our topology, we use some architectural elements which are only present in their reference implementation, but not in the model.
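For illustration, the following minimal sketch shows this duality in Kafka Streams; the topic name is hypothetical and default serdes are assumed. A stream of sensor measurements is interpreted as successive updates to a table holding the latest value per sensor, and the table can in turn be read back as a stream of update events.

```java
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;

class StreamTableDualitySketch {
  static void build(StreamsBuilder builder) {
    // Append-only stream of (sensorId, measurement) records.
    KStream<String, Double> measurements = builder.stream("sensor-measurements");

    // Interpreted as successive updates to a table: the latest value per key.
    KTable<String, Double> latestPerSensor = measurements
        .groupByKey()
        .reduce((previous, latest) -> latest);

    // The table can in turn be represented as a stream of update events.
    KStream<String, Double> updateEvents = latestPerSensor.toStream();
  }
}
```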
B. Topology Architecture
We apply the dual streaming model to model the topology of consecutive operations on the streaming data, which are required for aggregating sensor data. This model enforces a unidirectional and side-effect-free description of the data flow and, thus, allows the scalability and reliability facilitated by the model to be exploited. The architecture described in this way can be implemented as an encapsulated component, e.g., as a microservice, which can be integrated into existing software systems. Fig. 2 visualizes our proposed topology architecture. The individual processing steps are described in the following.

Fig. 2. Topology of our proposed stream processing architecture. Vertical cylinders represent tables whereas horizontal cylinders represent data streams. Streaming operators are represented by rectangular boxes, which are connected to other operators, tables, or streams by directed arrows. Annotations at connections, tables, and streams indicate the corresponding type of contained or transmitted data, where key and value type are separated by a colon. sensor represents a unique identifier for a sensor and group such an identifier for a sensor group, group[] represents a set of group identifiers, meas represents a measurement, and aggr represents an aggregation result. Everything located inside the gray box corresponds to the part of our approach which can be deployed as an individual microservice. The tables and streams placed outside can be considered as interfaces to other components.
1) Data Sources:
Our proposed topology requires two data sources. The first one is an input stream of measurements, keyed by a sensor identifier. This is the sensor data stream as it comes, for example, from IoT devices or performance monitoring tools. The second one is a table, mapping sensors or sensor groups to the sensor groups containing them. A table entry consists of a key, which is the identifier of a sensor or group, and a value, which is the set of all groups this sensor (group) is part of. This table can, for example, be created from a stream which captures changes in the hierarchy. It is not important which hierarchy a group belongs to; the only requirement is that identifiers are unique among multiple hierarchies.
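The following sketch shows how these two data sources could be declared in Kafka Streams; the topic names are hypothetical and a serde for the group set is assumed to be configured.

```java
import java.util.Set;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;

class DataSourcesSketch {
  static void build(StreamsBuilder builder) {
    // Input stream of measurements, keyed by sensor (or group) identifier.
    KStream<String, Double> measurements = builder.stream("sensor-measurements");

    // Table mapping each sensor or group identifier to the set of groups
    // containing it, e.g., built from a stream of hierarchy changes.
    KTable<String, Set<String>> parentGroups = builder.table("sensor-groups");
  }
}
```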
2) Merging Measurement Streams:
The first operator merges the input stream with a stream of already calculated aggregation results (see step 7). Note that the aggregated stream may require an additional converting step. In the following, we make no distinction between sensor measurements and aggregation results and call these values simply measurements.
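A sketch of this merge step is given below; the feedback topic name and the aggregation record type are assumptions, and the conversion corresponds to the additional converting step mentioned above.

```java
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KStream;

class MergeSketch {
  // Hypothetical value type of fed-back aggregation results.
  record AggregationResult(double sum, long count) {}

  static KStream<String, Double> mergedInput(StreamsBuilder builder,
                                             KStream<String, Double> measurements) {
    // Convert fed-back aggregation results into plain measurements ...
    KStream<String, Double> feedback = builder
        .<String, AggregationResult>stream("aggregation-results")
        .mapValues(AggregationResult::sum);
    // ... and merge them with the raw sensor measurement stream.
    return measurements.merge(feedback);
  }
}
```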
3) Joining Measurement Stream and Group Table:
The measurement stream and the groups table are joined using an inner join operation. This leads to a new update of the resulting table whenever either a measurement arrives or the groups a sensor belongs to change. The result of this join operation is a tuple consisting of the measurement and the set of all groups this measurement has an effect on.
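One way to express this join in Kafka Streams is a table–table join, assuming the merged measurement stream has first been converted into a table of latest measurements (cf. the reduce operations mentioned in Section V); the value tuple type is a hypothetical helper.

```java
import java.util.Set;
import org.apache.kafka.streams.kstream.KTable;

class JoinSketch {
  // Hypothetical tuple of a measurement and the set of groups it affects.
  record MeasurementWithGroups(double measurement, Set<String> groups) {}

  static KTable<String, MeasurementWithGroups> join(
      KTable<String, Double> latestMeasurements,
      KTable<String, Set<String>> parentGroups) {
    // Inner join: the result is updated whenever a new measurement arrives
    // or the set of groups a sensor belongs to changes.
    return latestMeasurements.join(parentGroups,
        (measurement, groups) -> new MeasurementWithGroups(measurement, groups));
  }
}
```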
4) Duplicating Join Results:
In the next step, the measurements are duplicated in such a way that a new record is forwarded for each group. This record has the following form: The key is a pair of the sensor identifier and the corresponding group this record is created for. The value is the measurement. This operation is stateful as it always stores the last set of sensor groups and compares it with the new one. If a sensor was part of a group in a previous record but not in the currently processed one, a special record with a "tombstone" value is forwarded. This value serves to inform the following topology operators that the corresponding sensor is no longer part of the corresponding group.
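A simplified sketch of this duplication step is shown below; the composite key and the join result type are hypothetical helpers, and the stateful comparison with the previous group set as well as the emission of tombstone records are omitted, as they require a custom stateful operator in practice (cf. Section V).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.kstream.KStream;

class DuplicateSketch {
  // Hypothetical helper types for the join result and the composite key.
  record MeasurementWithGroups(double measurement, Set<String> groups) {}
  record SensorGroupKey(String sensor, String group) {}

  static KStream<SensorGroupKey, Double> duplicate(
      KStream<String, MeasurementWithGroups> joinResults) {
    // Forward one record per group the measurement has an effect on.
    return joinResults.<SensorGroupKey, Double>flatMap((sensor, value) -> {
      List<KeyValue<SensorGroupKey, Double>> records = new ArrayList<>();
      for (String group : value.groups()) {
        records.add(KeyValue.pair(new SensorGroupKey(sensor, group), value.measurement()));
      }
      return records;
    });
  }
}
```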
5) Immediate Result: Last Value Table:
The duplicated records are materialized to a table, which lists the last measured value per sensor and group. An arriving tombstone record for a sensor-group pair deletes the corresponding entry in the table. This table of last values is the entry point for the following aggregation.
6) Grouping and Aggregating:
Similar to an SQL group-by operation, table entries are grouped by their group name (the second part of the key). The result is a grouped table containing one entry per group identifier. This table is then aggregated using appropriate adding and subtracting functions, resulting in one aggregation result per group identifier. As defined by the dual streaming model, whenever an entry in the last values table is updated, a corresponding update record updates the grouped table and triggers the computation of a new aggregation result. This is done by first calling the subtract function for the previous record (e.g., subtracting the sensor's previous measurement from the group's total value), followed by calling the add function for the new value (e.g., adding the sensor's new measurement to the group's total value). Deleting an entry in the last value table (via a tombstone record) solely causes calling the subtract function.
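A sketch of this step in Kafka Streams, using the adder and subtractor of a grouped-table aggregation and publishing the results, is given below; serdes are omitted, and the composite key type and topic name are hypothetical.

```java
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.kstream.KTable;

class AggregateSketch {
  // Hypothetical composite key of sensor identifier and group identifier.
  record SensorGroupKey(String sensor, String group) {}

  static void build(KTable<SensorGroupKey, Double> lastValues) {
    KTable<String, Double> groupAggregations = lastValues
        // Group the last-value table entries by the group part of the key.
        .groupBy((key, value) -> KeyValue.pair(key.group(), value))
        // On every update, the previous value is subtracted and the new value
        // is added to the group's aggregation result.
        .aggregate(
            () -> 0.0,                                        // initializer
            (group, value, aggregate) -> aggregate + value,   // adder
            (group, value, aggregate) -> aggregate - value);  // subtractor

    // Publish one aggregation result per group; this stream is also fed back
    // to the beginning of the topology.
    groupAggregations.toStream().to("aggregation-results");
  }
}
```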
7) Output: Aggregation Results:
As a last step, the aggregated values per sensor group are published to a data stream. On the one hand, this stream serves as an interface so that other applications or services can use these data as if they were real measurements. On the other hand, the stream is fed back to the beginning of the topology, where it can be used to compute aggregated values for sensor groups containing this group.

Fig. 3. Screenshot of the Titan Control Center's comparison view. It shows the overall power consumption of the Kieler Nachrichten Druckzentrum in comparison to the main subconsumers. The overall consumption was continuously computed by aggregating the measurements of all subconsumers.

V. INDUSTRIAL CASE STUDY: IOT SENSOR DATA
In this section, we return to our motivating example from Section II and show in a pilot implementation with our partner Kieler Nachrichten how our proposed architecture can be used to aggregate IoT sensor data. The Titan Control Center [12] is a microservice-based application for analyzing the energy consumption in manufacturing enterprises. It integrates energy consumption data of different data sources (e.g., machine-level data, building technology, or external software systems) and aggregates, analyzes, and visualizes them in near real time.
We extended the Control Center's architecture by an additional microservice, which implements the topology described above and replaces a former, less scalable and reliable data aggregation. This microservice subscribes to a stream providing energy consumption data of individual machines and to a stream forwarding changes to the sensor hierarchy (published by the Configuration microservice of the Control Center). It aggregates the data of all sensors in a group by summing them up and publishes every result to a dedicated topic, allowing other services to subscribe to this aggregated data. Other microservices use these data, for example, to calculate power consumption statistics, to produce forecasts, or to visualize them.
Specifying the proposed architecture with the dual streaming model allows for a straightforward implementation in Kafka Streams. Nevertheless, as the dual streaming model is more abstract than the Kafka Streams API, we had to introduce some additions: In some places, streams had to be explicitly converted into tables via reduce operations. The data type of aggregated records does not match the data type for sensor measurements; thus, an additional map operation before merging with the sensor data stream was necessary. To avoid emitting subtract events from the table of aggregations, these have to be explicitly filtered out. The operator duplicating records for each parent had to be implemented as a custom flatTransform operator, as Kafka Streams currently does not support flatMap operations on tables. Using Kafka Streams promises a high degree of scalability and reliability, as deploying multiple instances of this microservice causes Kafka Streams to balance the load among all instances based on the topology and the number of configured Kafka topic partitions.
Our implementation is used in production with two manufacturing enterprises to apply Industrial DevOps [13]. In these enterprises, the aggregation results are used, for example, to gain insights into how much energy is used by certain types of machines (e.g., the overall air conditioning), how big the difference is between the measured company-wide energy consumption and the sum of all known consumers, or how much an individual machine contributes to the overall consumption of all machines of that type. Fig. 3 shows a screenshot of the Titan Control Center's comparison view. It visualizes the total electrical power consumption of our partner
Kieler Nachrichten Druckzentrum, a newspaper printing company, in comparison to the consumption of its major consumers.

VI. EXPERIMENTAL EVALUATION
We experimentally evaluate the scalability and the reliability of the proposed stream processing architecture. For these evaluations, we simulate different numbers of power metering sensors and aggregate their data streams to pre-defined groups using the Titan Control Center (see above). Each simulated sensor emits one measurement per second. We record both the number of sensor measurements per second and the number of computed aggregation results per second as well as the average latency per second between sensor record generation and obtaining the result of an aggregation. The experimental setup is deployed in a Kubernetes cluster of 4 nodes, each equipped with 384 GB RAM and × CPU cores providing 64 threads (overall 256 parallel threads).

Fig. 4. Number of required instances for aggregation in relation to the number of generated records per second. The size of points indicates the frequency of the respective observation. The black line connects the median numbers of required instances per workload.
1) Evaluation of Scalability:
In order to assess the scalability of our proposed approach, we evaluate how an increasing amount of input data can be handled by an increasing number of aggregation instances. For this purpose, we determine the number of instances required to aggregate the data streams of a given workload. We group 8 simulated sensors into one group and group 8 of such groups again into one larger group. We evaluate 4 workloads, where we simulate 2, 3, 4, or 5 nested groups, resulting in a total amount of sensors of 64, 512, 4,096, and 32,768 and a corresponding number of records per second. For each scenario, we deploy different numbers of instances ranging from 1 to 128.
We consider a deployment (i.e., a certain set of deployed instances) as being sufficient to handle the given workload if all simulated data can be processed such that no data records are piling up between their generation and the aggregation. To obtain this information, we measure the latency between generation and aggregation. If this latency remains reasonably constant, we conclude that records are aggregated at approximately the same speed as they are generated. We apply linear regression to compute a trend line and consider the processing to be reasonably constant if the trend line's slope is less than 100 ms. (Due to Kafka Streams' task model, the throughput is subject to large fluctuations. The calculated trend line can therefore be inaccurate, suggesting a more conservative threshold for the slope of 100 ms. Lower thresholds return similar median results, but produce more outliers.) For each evaluated workload, we determine the minimum number of instances which is required to handle that workload, i.e., which shows a latency trend line with a slope of less than 100 ms. This evaluation is repeated 10 times.
Fig. 4 shows the median over all repetitions of the required number of instances per workload. We notice that the required number of instances scales linearly with the amount of data to be aggregated.
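As an illustration of this criterion, the following hypothetical helper (not part of the evaluation tooling) computes the ordinary least-squares slope of the observed latencies and compares it against the threshold.

```java
import java.util.List;

class TrendLineCheck {
  /**
   * Ordinary least-squares slope of latency observations (in ms) sampled once
   * per second. A deployment is considered sufficient if the slope stays below
   * the threshold, i.e., latencies do not grow systematically over time.
   */
  static boolean sufficient(List<Double> latenciesMs, double maxSlopeMs) {
    int n = latenciesMs.size();
    double meanX = (n - 1) / 2.0; // time indices 0..n-1
    double meanY = latenciesMs.stream().mapToDouble(Double::doubleValue).average().orElse(0);
    double cov = 0, varX = 0;
    for (int i = 0; i < n; i++) {
      cov += (i - meanX) * (latenciesMs.get(i) - meanY);
      varX += (i - meanX) * (i - meanX);
    }
    double slope = varX == 0 ? 0 : cov / varX; // latency increase in ms per second
    return slope < maxSlopeMs;                 // e.g., maxSlopeMs = 100
  }
}
```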
2) Evaluation of Reliability:
In order to assess the reliability of our proposed approach, we evaluate how the architecture behaves when components fail. We generate the workload described in the scalability evaluation, which simulates 4 nested groups of sensors, and aggregate the data with 24 instances of the aggregation component. After 10 minutes of processing, we inject a failure during operation by shutting down 18 instances and starting them again after 5 more minutes. We measure both the number of generated messages and the amount of aggregation results during the entire evaluation. Since both values fluctuate strongly, we additionally calculate a moving average over a 60-second window.
The average number of processed records per second over time is presented in Fig. 5. It can be seen that the amount of performed aggregations decreases sharply during the simulated failure. This is reasonable as there are no longer enough resources available to process all data. However, when the stopped instances are replaced by new instances, the amount of processed data increases again. Furthermore, as we set the number of processing instances to twice the number actually necessary (cf. scalability evaluation), the piled-up data is also processed.

Fig. 5. Number of processed records per second over a moving average of 60 seconds in the course of time. The two dashed vertical lines indicate the points in time at which the simulated failure starts and ends.

VII. LIMITATIONS AND POSSIBLE EXTENSIONS
Our proposed architecture has two limitations: First, it yields incomplete results for the first records until every sensor has generated at least one measurement. This could be fixed by adding additional logic in the flat transform step or, for many use cases, simply be ignored, as after a short warm-up the approach works as expected.
The second limitation concerns out-of-order records. According to the dual streaming model, a late-arriving record would cause a recomputation of all consecutive steps. While this is in principle desired, the corresponding lookup in the last values table only yields the most recent value, causing wrong results. A simple, yet in many cases sufficient solution is to simply reject late-arriving records. A more complex, but also correct solution is to introduce windowing operations, which compute aggregation results for time windows of, e.g., one minute. In effect, this approach would maintain not only one last values table but one last values table per time window.
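A sketch of such a windowed aggregation in Kafka Streams is shown below; it is not part of our current implementation, serdes are omitted, and the exact windowing API may vary across Kafka Streams versions.

```java
import java.time.Duration;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.TimeWindows;
import org.apache.kafka.streams.kstream.Windowed;

class WindowedAggregationSketch {
  // Aggregate measurements per group within tumbling one-minute windows,
  // which bounds the effect of late-arriving records to their window.
  static KTable<Windowed<String>, Double> aggregate(KStream<String, Double> perGroup) {
    return perGroup
        .groupByKey()
        .windowedBy(TimeWindows.of(Duration.ofMinutes(1)))
        .reduce(Double::sum);
  }
}
```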
VIII. RELATED WORK
Joins in stream processing (analogous to those in SQL) [14] allow to connect different or identical streams. As the join operation is a bivariate function and thus aggregating multiple streams would require a chain of join operations, a respective pipeline can only be created statically and not adjusted dynamically at runtime. This contrasts with our approach, which allows reconfigurations at runtime by changing the sensor groups table or an underlying stream.
Another type of aggregating data in streams is along the dimension of time. In this case, records within the same time window having the same key are aggregated to a new record. Depending on the actual requirements, records can be aggregated within fixed windows, sliding windows, or session-based windows [10]. In addition to research on the efficient computation of window aggregations [15], [16], there are also publications proposing software architectures for a temporal aggregation platform, for example by Twitter [17]. Temporal aggregations are compatible with our approach, which means that temporally aggregated streams of sensor data can be further aggregated to groups, and streams for sensor groups can be further aggregated over time.
As most approaches applying stream processing, our presented approach is primarily based on data parallelism, meaning that (sub)topologies exist in multiple instances, each processing a portion of the data. Pipe-and-Filter frameworks such as TeeTime [18] employ task parallelism, where the individual filters (operators) are executed in parallel. Since in this way all data pass each filter, the identified requirements can be realized in a single filter. In contrast to the solution we presented, however, scalability would be significantly compromised.

IX. CONCLUSIONS
Software systems which analyze or react to sensor data streams often have to process not only raw measurements but also aggregated data for groups of sensors. In this paper, we presented our Titan approach for continuously aggregating sensor data. It supports the aggregation of hierarchical groups, multiple such groups in parallel, and reconfigurations at runtime. For this purpose, we designed a stream processing architecture which can be integrated (e.g., as a microservice) into existing big data architectures. It consists of a topology of stream processing operators, requires two input data streams, which provide the sensor group hierarchies as well as the sensor data stream, and provides an output stream of aggregated data. We provide an implementation of this architecture for power consumption data and show how our implementation can be integrated into an analytics platform used in industry. Furthermore, in an experimental evaluation, we show that our proposed architecture scales linearly with the amount of sensors and tolerates faults during operation. A replication package and experimental results provided as supplemental material allow to repeat and extend our work [19].
For future work, we plan to add optional support for out-of-order records by aggregating measurements in time windows as described in Section VII. Furthermore, future work may explore how our architecture can be implemented with other stream processing engines, for example, by considering recent trends towards uniform stream query languages [20]. As we were not able to discover any scalability limitations in our conducted experimental evaluation, we also plan to conduct experiments with even larger amounts of sensors.

REFERENCES
[1] G. Cugola and A. Margara, "Processing flows of information: From data stream to complex event processing," ACM Computing Surveys, vol. 44, no. 3, 2012.
[2] H. Röger and R. Mayer, "A comprehensive survey on parallelization and elasticity in stream processing," ACM Computing Surveys, vol. 52, no. 2, 2019.
[3] J. Karimov, T. Rabl, A. Katsifodimos, R. Samarev, H. Heiskanen, and V. Markl, "Benchmarking distributed stream data processing systems," in Proc. IEEE International Conference on Data Engineering, 2018.
[4] J. Lin, "The Lambda and the Kappa," IEEE Internet Computing, vol. 21, no. 5, 2017.
[5] W. Hasselbring and G. Steinacker, "Microservice architectures for scalability, agility and reliability in e-commerce," in Proc. IEEE International Conference on Software Architecture Workshops, 2017.
[6] M. Fowler, "Event sourcing," 2005, accessed: 2019-11-13. [Online]. Available: https://martinfowler.com/eaaDev/EventSourcing.html
[7] M. J. Sax, G. Wang, M. Weidlich, and J.-C. Freytag, "Streams and tables: Two sides of the same coin," in Proc. International Workshop on Real-Time Business Intelligence and Analytics, 2018.
[8] D. J. Abadi, D. Carney, U. Çetintemel, M. Cherniack, C. Convey, S. Lee, M. Stonebraker, N. Tatbul, and S. Zdonik, "Aurora: A new model and architecture for data stream management," The VLDB Journal, vol. 12, no. 2, 2003.
[9] T. Akidau, A. Balikov, K. Bekiroğlu, S. Chernyak, J. Haberman, R. Lax, S. McVeety, D. Mills, P. Nordstrom, and S. Whittle, "MillWheel: Fault-tolerant stream processing at internet scale," Proc. VLDB Endow., vol. 6, no. 11, 2013.
[10] T. Akidau, R. Bradshaw, C. Chambers, S. Chernyak, R. J. Fernández-Moctezuma, R. Lax, S. McVeety, D. Mills, F. Perry, E. Schmidt, and S. Whittle, "The dataflow model: A practical approach to balancing correctness, latency, and cost in massive-scale, unbounded, out-of-order data processing," Proc. VLDB Endow., vol. 8, no. 12, 2015.
[11] G. Wang, J. Koshy, S. Subramanian, K. Paramasivam, M. Zadeh, N. Narkhede, J. Rao, J. Kreps, and J. Stein, "Building a replicated logging system with Apache Kafka," Proc. VLDB Endow., vol. 8, no. 12, 2015.
[12] S. Henning, W. Hasselbring, and A. Möbius, "A scalable architecture for power consumption monitoring in industrial production environments," in Proc. IEEE International Conference on Fog Computing, 2019.
[13] W. Hasselbring, S. Henning, B. Latte, A. Möbius, T. Richter, S. Schalk, and M. Wojcieszak, "Industrial DevOps," in Proc. IEEE International Conference on Software Architecture Companion, 2019.
[14] A. Arasu, S. Babu, and J. Widom, "The CQL continuous query language: Semantic foundations and query execution," The VLDB Journal, vol. 15, no. 2, 2006.
[15] J. Li, D. Maier, K. Tufte, V. Papadimos, and P. A. Tucker, "No pane, no gain: Efficient evaluation of sliding-window aggregates over data streams," SIGMOD Rec., vol. 34, no. 1, 2005.
[16] J. Traub, P. Grulich, A. R. Cuéllar, S. Breß, A. Katsifodimos, T. Rabl, and V. Markl, "Efficient window aggregation with general stream slicing," in Proc. Int. Conf. on Extending Database Technology, 2019.
[17] P. Yang, S. Thiagarajan, and J. Lin, "Robust, scalable, real-time event time series aggregation at Twitter," in Proc. International Conference on Management of Data, 2018.
[18] C. Wulf, W. Hasselbring, and J. Ohlemacher, "Parallel and generic pipe-and-filter architectures with TeeTime," in Proc. IEEE International Conference on Software Architecture Workshops, 2017.
[19] S. Henning and W. Hasselbring, "Replication package for: Scalable and reliable multi-dimensional aggregation of sensor data streams," 2019. [Online]. Available: https://doi.org/10.5281/zenodo.3540896
[20] E. Begoli, T. Akidau, F. Hueske, J. Hyde, K. Knight, and K. Knowles, "One SQL to rule them all – an efficient and syntactically idiomatic approach to management of streams and tables," in