Luping Ding
Worcester Polytechnic Institute
Publications
Featured research published by Luping Ding.
International Conference on Data Engineering | 2008
Luping Ding; Songting Chen; Elke A. Rundensteiner; Junichi Tatemura; Wang-Pin Hsiung; K.S. Candan
Detecting complex patterns in event streams, i.e., complex event processing (CEP), has become increasingly important for modern enterprises to react quickly to critical situations. In many practical cases business events are generated based on predefined business logic. Hence constraints, such as occurrence and order constraints, often hold among events. Reasoning using these known constraints enables us to predict the non-occurrences of certain future events, thereby helping us to identify and then terminate long-running query processes that are guaranteed not to lead to successful matches. In this work, we focus on exploiting event constraints to optimize CEP over large volumes of business transaction streams. Since the optimization opportunities arise at runtime, we develop a runtime query unsatisfiability (RunSAT) checking technique that detects optimal points for terminating query evaluation. To assure the efficiency of RunSAT checking, we propose mechanisms to precompute the query failure conditions to be checked at runtime. This guarantees a constant-time RunSAT reasoning cost, making our technique highly scalable. We realize our optimal query termination strategies by augmenting the query with Event-Condition-Action rules encoding the precomputed failure conditions. This results in an event processing solution compatible with state-of-the-art CEP architectures. Extensive experimental results demonstrate that significant performance gains are achieved, while the optimization overhead remains small.
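To make the idea concrete, here is a minimal sketch (Python, not the paper's implementation) of terminating partial CEP matches via a precomputed failure condition encoded as an Event-Condition-Action rule; the SEQ pattern, the event types, and the CancelOrder constraint are all invented for illustration.

```python
# Hypothetical pattern: SEQ(OrderPlaced, PaymentReceived, Shipped) per order_id.
# Assumed constraint: once CancelOrder occurs, PaymentReceived can no longer
# occur for that order, so the pending partial match is guaranteed to fail.

partial_matches = {}   # order_id -> event types matched so far

FAILURE_RULES = {
    # triggering event type -> predicate over the partial match that decides
    # whether the match has become unsatisfiable and can be terminated
    "CancelOrder": lambda matched: "PaymentReceived" not in matched,
}

def process(event):
    etype, order_id = event["type"], event["order_id"]

    # Event-Condition-Action rule: on a triggering event, purge doomed matches.
    if etype in FAILURE_RULES:
        matched = partial_matches.get(order_id)
        if matched is not None and FAILURE_RULES[etype](matched):
            del partial_matches[order_id]      # constant-time termination
        return None

    matched = partial_matches.setdefault(order_id, [])
    expected = ["OrderPlaced", "PaymentReceived", "Shipped"][len(matched)]
    if etype == expected:
        matched.append(etype)
        if len(matched) == 3:                  # complete match produced
            del partial_matches[order_id]
            return ("MATCH", order_id)
    return None

stream = [
    {"type": "OrderPlaced", "order_id": 1},
    {"type": "OrderPlaced", "order_id": 2},
    {"type": "CancelOrder", "order_id": 2},    # order 2 terminated early
    {"type": "PaymentReceived", "order_id": 1},
    {"type": "Shipped", "order_id": 1},
]
for e in stream:
    result = process(e)
    if result:
        print(result)                          # ('MATCH', 1)
print("open partial matches:", partial_matches)  # {}
```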
Extending Database Technology | 2004
Luping Ding; Nishant K. Mehta; Elke A. Rundensteiner; George T. Heineman
We focus on stream join optimization by exploiting constraints that are dynamically embedded into data streams to signal the end of transmitting certain attribute values. These constraints are called punctuations. Our stream join operator, PJoin, is able to remove no-longer-useful data from the state in a timely manner based on punctuations, thus reducing memory overhead and improving the efficiency of probing. We equip PJoin with several alternative strategies for purging the state and for propagating punctuations to benefit downstream operators. We also present an extensive experimental study exploring the performance gains achieved by purging state as well as the trade-offs between the different purge strategies. Our experimental results, comparing PJoin with XJoin, a stream join operator without a constraint-exploiting mechanism, show that PJoin significantly outperforms XJoin with regard to both memory overhead and throughput.
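The following is a minimal sketch of the punctuation idea, assuming a symmetric hash join over two hypothetical streams keyed by an order id; it illustrates punctuation-driven state purging in general, not PJoin's actual code.

```python
from collections import defaultdict

state = {"A": defaultdict(list), "B": defaultdict(list)}   # join key -> stored tuples

def probe_and_insert(side, key, tup):
    """One symmetric-hash-join step: probe the other side's state, then insert."""
    other = "B" if side == "A" else "A"
    results = [(tup, m) if side == "A" else (m, tup) for m in state[other][key]]
    state[side][key].append(tup)
    return results

def on_punctuation(side, key):
    """Punctuation: no more tuples with `key` will arrive on `side`.
    Tuples on the other side stored only to match future arrivals from `side`
    can be purged; the punctuation could also be propagated downstream."""
    other = "B" if side == "A" else "A"
    return state[other].pop(key, [])

# toy run: hypothetical orders (stream A) and shipments (stream B) keyed by order id
print(probe_and_insert("A", 7, ("order", 7)))               # [] -> nothing to join yet
print(probe_and_insert("B", 7, ("shipment", 7, "box1")))    # joins with order 7
print("purged:", on_punctuation("A", 7))                    # shipment state for key 7 freed
print("remaining B keys:", list(state["B"].keys()))         # []
```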
Conference on Information and Knowledge Management | 2004
Luping Ding; Elke A. Rundensteiner
We explore join optimizations in the presence of both time-based constraints (sliding windows) and value-based constraints (punctuations). We present the first join solution, named PWJoin, that exploits such combined constraints to shrink the runtime join state and to propagate punctuations to benefit downstream operators. We design a state structure for PWJoin that facilitates the exploitation of both constraint types. We also explore optimizations enabled by the interactions between windows and punctuations, e.g., early punctuation propagation. The costs of PWJoin are analyzed using a cost model. We also conduct an experimental study using the CAPE continuous query system. The experimental results show that in most cases, by exploiting punctuations, PWJoin outperforms the pure window join with regard to both memory overhead and throughput. Our technique complements joins in the literature, such as the symmetric hash join or the window join, enabling them to require fewer runtime resources without compromising the accuracy of the result.
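A tiny illustrative sketch of combining the two constraint types follows, with invented window size, keys, and timestamps; it only shows that a stored tuple can be dropped by whichever purge condition (window expiration or punctuation) fires first, not PWJoin's actual state structure.

```python
WINDOW = 10   # assumed window size in timestamp units

state = []           # (timestamp, key, tuple) kept for future probes
closed_keys = set()  # keys that a punctuation has declared finished

def insert(ts, key, tup):
    if key not in closed_keys:    # no reason to store a finished key
        state.append((ts, key, tup))

def on_punctuation(key):
    closed_keys.add(key)
    state[:] = [e for e in state if e[1] != key]          # value-based purge

def advance_time(now):
    state[:] = [e for e in state if e[0] > now - WINDOW]  # time-based purge

insert(1, "k1", "a")
insert(2, "k2", "b")
on_punctuation("k1")   # k1 is purged immediately, before its window expires
advance_time(15)       # k2 is purged once it slides out of the window
print(state, closed_keys)   # [] {'k1'}
```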
International Conference on Distributed Computing Systems Workshops | 2007
Ming Li; Mo Liu; Luping Ding; Elke A. Rundensteiner; Murali Mani
Complex event processing has become increasingly important in modern applications, ranging from supply chain management for RFID tracking to real-time intrusion detection. The goal is to extract patterns from such event streams in order to make informed decisions in real time. However, networking latencies and even machine failures may cause events to arrive out of order at the event stream processing engine. In this work, we address the problem of processing event pattern queries specified over event streams that may contain out-of-order data. First, we analyze the problems that state-of-the-art event stream processing technology would experience when faced with out-of-order data arrival. We then propose new physical implementation strategies for the core stream algebra operators, such as sequence scan and pattern construction, including stack-based data structures and associated purge algorithms. Optimizations for sequence scan and construction as well as state purging to minimize CPU cost and memory consumption are also introduced. Lastly, we conduct an experimental study demonstrating the effectiveness of our approach.
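Below is a small, hypothetical sketch of out-of-order-tolerant sequence construction for a two-step pattern SEQ(A, B); the per-type, timestamp-ordered structures stand in loosely for the stack-based data structures mentioned above, and the event types and timestamps are invented.

```python
import bisect

received = {"A": [], "B": []}   # per-event-type timestamps, kept sorted

def receive(etype, ts):
    """Insert an event (possibly out of order) and return the SEQ(A, B)
    results that this arrival completes."""
    bisect.insort(received[etype], ts)
    if etype == "A":
        return [(ts, b) for b in received["B"] if b > ts]
    return [(a, ts) for a in received["A"] if a < ts]

print(receive("B", 5))   # []      : no A seen yet
print(receive("A", 2))   # [(2, 5)]: A arrives late but still precedes B at 5
print(receive("B", 8))   # [(2, 8)]
```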
Web Information and Data Management | 2002
Maged El-Sayed; Ling Wang; Luping Ding; Elke A. Rundensteiner
Modern data sources, including structured and semi-structured sources, often export XML views over base data, and at times may materialize their views by storing the XML query result to provide faster data access. It is typically more efficient to maintain a view by incrementally propagating the base changes to the view than by recomputing it from scratch. Techniques for the incremental maintenance of relational views have been extensively studied in the literature. However, the maintenance of views created using XQuery has so far remained unexplored. In this paper we propose an algebraic approach for incremental XQuery view maintenance. In our approach, an update to the XML source is transformed into a set of well-defined update primitives, which are propagated through the XML algebra tree. This algebraic update propagation process generates incremental update primitives to be applied to the result view. We briefly discuss our XQuery view maintenance system implementation. Our experiments confirm that incremental view maintenance is indeed faster than recomputation.
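As a loose illustration of the general principle (using relational-style operators rather than the paper's XML algebra), the sketch below propagates an insert delta through a selection and a projection and applies only that delta to the materialized view; the view definition and data are hypothetical.

```python
def propagate_select(pred, delta):       # push a delta through a selection
    return [row for row in delta if pred(row)]

def propagate_project(cols, delta):      # push a delta through a projection
    return [{c: row[c] for c in cols} for row in delta]

# hypothetical view: titles of items priced under 20
view = []                                # the materialized result

def apply_insert(new_rows):
    delta = propagate_project(
        ["title"], propagate_select(lambda r: r["price"] < 20, new_rows))
    view.extend(delta)                   # apply only the propagated delta

apply_insert([{"title": "Streams 101", "price": 15},
              {"title": "XML in Depth", "price": 45}])
print(view)   # [{'title': 'Streams 101'}] -- no recomputation from scratch
```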
International Conference on Management of Data | 2003
Xin Zhang; Katica Dimitrova; Ling Wang; Maged El Sayed; Brian Murphy; Bradford Pielech; Mukesh Mulchandani; Luping Ding; Elke A. Rundensteiner
Outlook. We present multiple XQuery optimizations based on materialized XML view technology in the Rainbow system. In this demo we show in particular: (1) Rainbow's support for defining and incrementally maintaining materialized XQuery views; (2) XQuery optimization by query rewriting to use materialized views; (3) multiple query optimization by merging multiple XML queries (XATs) into one global access plan in order to decide upon materialization of intermediate results as views; and (4) processing of updates issued on XML views that wrap relational data, by decomposing the updates into SQL update statements and consistency checks on the relational base data.

The Rainbow System. We have extended Rainbow [1], our existing XML data management system, as shown in Figure 1. Rainbow accepts an XQuery query or an update request in an extended XQuery syntax from the user. The XQuery is parsed into an algebraic representation called an XML Algebra Tree (XAT) [3]. The XAT is then optimized by the global query optimizer using algebraic rewrite rules [2]. We have introduced a separate phase of XAT cleanup [2], which includes XAT table schema cleanup and the removal of unnecessary XML operators. This optimization often significantly improves query performance. The optimized XAT is then executed by the query manager.
International Database Engineering and Applications Symposium | 2005
Timothy M. Sutherland; Bradford Pielech; Yali Zhu; Luping Ding; Elke A. Rundensteiner
Adaptive operator scheduling algorithms for continuous query processing are usually designed to serve a single performance objective, such as minimizing memory usage or maximizing query throughput. We observe that different performance objectives may sometimes conflict with each other. Also due to the dynamic nature of streaming environments, the performance objective may need to change dynamically. Furthermore, the performance specification defined by users may itself be multi-dimensional. Therefore, utilizing a single scheduling algorithm optimized for a single objective is no longer sufficient. In this paper, we propose a novel adaptive scheduling algorithm selection framework named AMoS. It is able to leverage the strengths of existing scheduling algorithms to meet multiple performance objectives. AMoS employs a lightweight learning mechanism to assess the effectiveness of each algorithm. The learned knowledge can be used to select the algorithm that probabilistically has the best chance of improving the performance. In addition, AMoS has the flexibility to add and adapt to new scheduling algorithms, query plans and data sets during execution. Our experimental results show that AMoS significantly outperforms the existing scheduling algorithms with regard to satisfying both uni-objective and multi-objective performance requirements.
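The sketch below illustrates the general flavor of score-based selection among candidate schedulers; the scheduler names, the reward model, and the exponential-moving-average update are assumptions for illustration, not AMoS's actual learning mechanism.

```python
import random

# effectiveness score per candidate scheduling policy (names are invented)
scores = {"fifo": 1.0, "chain": 1.0, "greedy_throughput": 1.0}

def pick_scheduler():
    """Probabilistically favor the policy with the best track record so far."""
    total = sum(scores.values())
    r, acc = random.uniform(0, total), 0.0
    for name, score in scores.items():
        acc += score
        if r <= acc:
            return name
    return name

def report(name, reward):
    """Lightweight learning: exponential moving average of observed benefit."""
    scores[name] = 0.8 * scores[name] + 0.2 * reward

for _ in range(100):                       # simulated scheduling epochs
    chosen = pick_scheduler()
    # pretend the environment rewards one policy more than the others
    reward = {"fifo": 0.3, "chain": 0.9, "greedy_throughput": 0.5}[chosen]
    report(chosen, reward)

print(max(scores, key=scores.get))         # typically converges to "chain"
```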
Distributed Event-Based Systems | 2003
Luping Ding; Elke A. Rundensteiner; George T. Heineman
Join algorithms must be redesigned when processing stream data instead of persistently stored data. Data streams are potentially infinite, and the query result is expected to be generated incrementally rather than all at once. Data arrival patterns are often unpredictable, and the statistics of the data and other relevant metadata are often known only at runtime. In some cases they are supplied interleaved with the actual data in the form of stream markers. Recently, stream join algorithms such as the Symmetric Hash Join and XJoin have been designed to operate in a pipelined fashion to cope with potentially delayed data delivery. However, none of them to date takes metadata, especially runtime metadata, into consideration. Hence, the join execution logic defined statically before runtime may not be well suited to deal with varying dynamic runtime scenarios. Also, potentially unbounded state needs to be maintained by the join operator to guarantee the precision of the result. In this paper, we propose a metadata-aware stream join operator called MJoin, which is able to exploit metadata to (1) detect and purge useless materialized data to save computation resources and (2) optimize the execution logic to target different optimization goals. We have implemented the MJoin operator. The experimental results validate our metadata-driven join optimization strategies.
International Conference on Data Engineering | 2011
Luping Ding; Karen Works; Elke A. Rundensteiner
Data stream management systems (DSMS) processing long-running queries over large volumes of stream data must typically deliver time-critical responses. We propose the first semantic query optimization (SQO) approach that utilizes dynamic substream metadata at runtime to find a more efficient query plan than the one selected at compilation time. We identify four SQO techniques guaranteed to result in performance gains. Based on classic satisfiability theory, we then design a lightweight query optimization algorithm that efficiently detects SQO opportunities at runtime. At the logical level, our algorithm instantiates multiple concurrent SQO plans, each processing different, partially overlapping substreams. Our novel execution paradigm employs multi-modal operators to support the execution of these concurrent SQO logical plans in a single physical plan. This highly agile execution strategy reduces resource utilization while supporting lightweight adaptivity. Our extensive experimental study in the CAPE stream processing system, using both synthetic and real data, confirms that our optimization techniques significantly reduce query execution times, by up to 60%, compared to the traditional approach.
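As a rough illustration of the satisfiability-style reasoning, the sketch below chooses a per-substream plan from advertised value-range metadata; the predicate, the metadata format, and the plan names are all hypothetical.

```python
QUERY_MIN = 100   # hypothetical query predicate: value >= 100

def plan_for_substream(meta_lo, meta_hi):
    """Choose a per-substream plan from the value range its metadata advertises."""
    if meta_hi < QUERY_MIN:
        return "drop"           # unsatisfiable: no tuple can pass the predicate
    if meta_lo >= QUERY_MIN:
        return "pass-through"   # predicate implied: filtering is redundant work
    return "filter"             # otherwise fall back to the regular plan

print(plan_for_substream(0, 50))      # drop
print(plan_for_substream(200, 900))   # pass-through
print(plan_for_substream(50, 500))    # filter
```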
Stream Data Management | 2005
Elke A. Rundensteiner; Luping Ding; Yali Zhu; Timothy M. Sutherland; Bradford Pielech
The growth of electronic commerce and the widespread use of sensor networks have created the demand for online processing and monitoring applications [17, 23, 26]. In these applications, data is no longer statically stored. Instead, it becomes available in the form of continuous streams. Furthermore, users often ask long-running queries and expect the results to be delivered incrementally in real time. Traditional query execution techniques, which assume finite persistent datasets and aim to produce a one-time query result, become largely inapplicable in this new stream paradigm for the following reasons: