Irina Botan | Researchain

Publication

Featured research published by Irina Botan.

very large data bases | 2010

SECRET: a model for analysis of the execution semantics of stream processing systems

Irina Botan; Roozbeh Derakhshan; Nihal Dindar; Laura M. Haas; Renée J. Miller; Nesime Tatbul

There are many academic and commercial stream processing engines (SPEs) today, each of them with its own execution semantics. This variation may lead to seemingly inexplicable differences in query results. In this paper, we present SECRET, a model of the behavior of SPEs. SECRET is a descriptive model that allows users to analyze the behavior of systems and understand the results of window-based queries for a broad range of heterogeneous SPEs. The model is the result of extensive analysis and experimentation with several commercial and academic engines. In the paper, we describe the types of heterogeneity found in existing engines, and show with experiments on real systems that our model can explain the key differences in windowing behavior.

extending database technology | 2009

Flexible and scalable storage management for data-intensive stream processing

Irina Botan; Gustavo Alonso; Peter Fischer; Donald Kossmann; Nesime Tatbul

Data Stream Management Systems (DSMS) operate under strict performance requirements. Key to meeting such requirements is to efficiently handle time-critical tasks such as managing internal states of continuous query operators, traffic on the queues between operators, as well as providing storage support for shared computation and archived data. In this paper, we introduce a general purpose storage management framework for DSMSs that performs these tasks based on a clean, loosely-coupled, and flexible system design that also facilitates performance optimization. An important contribution of the framework is that, in analogy to buffer management techniques in relational database systems, it uses information about the access patterns of streaming applications to tune and customize the performance of the storage manager. In the paper, we first analyze typical application requirements at different granularities in order to identify important tunable parameters and their corresponding values. Based on these parameters, we define a general-purpose storage management interface. Using the interface, a developer can use our SMS (Storage Manager for Streams) to generate a customized storage manager for streaming applications. We explore the performance and potential of SMS through a set of experiments using the Linear Road benchmark.

very large data bases | 2013

Modeling the execution semantics of stream processing engines with SECRET

Nihal Dindar; Nesime Tatbul; Renée J. Miller; Laura M. Haas; Irina Botan

There are many academic and commercial stream processing engines (SPEs) today, each of them with its own execution semantics. This variation may lead to seemingly inexplicable differences in query results. In this paper, we present SECRET, a model of the behavior of SPEs. SECRET is a descriptive model that allows users to analyze the behavior of systems and understand the results of window-based queries (with time- and tuple-based windows) for a broad range of heterogeneous SPEs. The model is the result of extensive analysis and experimentation with several commercial and academic engines. In the paper, we describe the types of heterogeneity found in existing engines and show with experiments on real systems that our model can explain the key differences in windowing behavior.
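To make the two window types named in this abstract concrete, here is a minimal Python sketch (not the SECRET model itself) that cuts one input stream into tumbling time-based and tuple-based windows. The function names, the sample stream, and the boundary rule (windows aligned at timestamp 0, with a tuple at time `ts` assigned to window `ts // size`) are assumptions made for illustration; boundary choices of this kind are one plausible place where engines could disagree.

```python
def time_windows(stream, size):
    """Tumbling time-based windows: a tuple (ts, value) lands in window ts // size.

    Assumption: windows are aligned at timestamp 0 and are closed on the left,
    open on the right. Empty windows are not reported.
    """
    windows = {}
    for ts, value in stream:
        windows.setdefault(ts // size, []).append(value)
    return [windows[k] for k in sorted(windows)]


def tuple_windows(stream, count):
    """Tumbling tuple-based windows: every `count` consecutive tuples form a window."""
    values = [value for _, value in stream]
    return [values[i:i + count] for i in range(0, len(values), count)]


# Hypothetical input: (timestamp, value) pairs in arrival order.
stream = [(0, 10), (1, 20), (2, 30), (5, 40), (6, 50), (9, 60)]

by_time = time_windows(stream, 3)
by_count = tuple_windows(stream, 3)
print(by_time)   # [[10, 20, 30], [40], [50], [60]]
print(by_count)  # [[10, 20, 30], [40, 50, 60]]
```

The same six tuples yield four windows under the time-based rule and two under the tuple-based rule, so any aggregate computed per window differs as well.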

extending database technology | 2012

Transactional stream processing

Irina Botan; Peter Fischer; Donald Kossmann; Nesime Tatbul

Many stream processing applications require access to a multitude of streaming as well as stored data sources. Yet there is no clear semantics for correct continuous query execution over these data sources in the face of concurrent access and failures. Instead, today's Stream Processing Systems (SPSs) hard-code transactional concepts in their execution models, making them both hard to understand and inflexible to use. In this paper, we show that we can successfully reuse the traditional transactional theory (with some minimal extensions) in order to cleanly define the correct interaction of a set of continuous and one-time queries concurrently accessing both streaming and stored data sources. The result is a unified transactional model (UTM) for query processing over streams as well as traditional databases. We present a transaction manager that implements this model on top of an existing storage manager for streams (MXQuery/SMS). Experiments on the Linear Road Benchmark show that our transaction manager flexibly ensures correctness in case of concurrency and failures, without sacrificing performance. Moreover, this model is powerful enough to express the implicit transactional behaviors of a representative set of state-of-the-art SPSs.

business intelligence for the real-time enterprises | 2009

Federated Stream Processing Support for Real-Time Business Intelligence Applications

Irina Botan; Younggoo Cho; Roozbeh Derakhshan; Nihal Dindar; Laura M. Haas; Kihong Kim; Nesime Tatbul

In this paper, we describe the MaxStream federated stream processing architecture to support real-time business intelligence applications. MaxStream builds on and extends the SAP MaxDB relational database system in order to provide a federator over multiple underlying stream processing engines and databases. We show preliminary results on usefulness and performance of the MaxStream architecture on the SAP Sales and Distribution Benchmark.

international conference on data engineering | 2010

A demonstration of the MaxStream federated stream processing system

Irina Botan; Younggoo Cho; Roozbeh Derakhshan; Nihal Dindar; Ankush Gupta; Laura M. Haas; Kihong Kim; Chulwon Lee; Girish Mundada; Ming-Chien Shan; Nesime Tatbul; Ying Yan; Beomjin Yun; Jin Zhang

MaxStream is a federated stream processing system that seamlessly integrates multiple autonomous and heterogeneous Stream Processing Engines (SPEs) and databases. In this paper, we propose to demonstrate the key features of MaxStream using two application scenarios, namely the Sales Map & Spikes business monitoring scenario and the Linear Road Benchmark, each with a different set of requirements. More specifically, we will show how the MaxStream Federator can translate and forward the application queries to two different commercial SPEs (Coral8 and StreamBase), as well as how it does so under various persistency requirements.

very large data bases | 2011

UpStream: storage-centric load management for streaming applications with update semantics

Alexandru Moga; Irina Botan; Nesime Tatbul

This paper addresses the problem of minimizing the staleness of query results for streaming applications with update semantics under overload conditions. Staleness is a measure of how out-of-date the results are compared with the latest data arriving on the input. Real-time streaming applications are subject to overload due to unpredictably increasing data rates, while in many of them, we observe that data streams and queries in fact exhibit “update semantics” (i.e., the latest input data are all that really matters when producing a query result). Under such semantics, overload will cause staleness to build up. The key to avoiding this is to exploit the update semantics of applications as early as possible in the processing pipeline. In this paper, we propose UpStream, a storage-centric framework for load management over streaming applications with update semantics. We first describe how we model streams and queries that possess the update semantics, providing definitions for correctness and staleness for the query results. Then, we show how staleness can be minimized based on intelligent update key scheduling techniques applied at the queue level, while preserving the correctness of the results, even for complex queries that involve sliding windows. UpStream is based on the simple idea of applying the updates in place, yet with great returns in terms of lowering staleness and memory consumption, as we also experimentally verify on the Borealis system.
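The idea of applying updates in place at the queue level can be illustrated with a small Python sketch. This is not UpStream's implementation: the class name `InPlaceUpdateQueue`, its methods, and the plain first-arrival service order are assumptions chosen for illustration, and the update key scheduling techniques and sliding-window handling mentioned in the abstract are not modeled here.

```python
from collections import OrderedDict


class InPlaceUpdateQueue:
    """Hypothetical queue that holds at most one pending value per update key."""

    def __init__(self):
        # key -> latest value; iteration order is the order of first arrival.
        self._pending = OrderedDict()

    def enqueue(self, key, value):
        # Overwrite in place: a newer value for a key that is already waiting
        # replaces the stale one and keeps the key's position in the queue.
        self._pending[key] = value

    def dequeue(self):
        # Serve the key that has waited longest, with its freshest value.
        # Raises KeyError if the queue is empty.
        return self._pending.popitem(last=False)

    def __len__(self):
        return len(self._pending)


q = InPlaceUpdateQueue()
q.enqueue("A", 1)
q.enqueue("B", 5)
q.enqueue("A", 2)        # overwrites the pending value for key "A"

size_before = len(q)     # 2 entries, although 3 tuples arrived
first = q.dequeue()      # ("A", 2)
second = q.dequeue()     # ("B", 5)
```

Under overload, a queue like this grows with the number of distinct keys rather than with the number of arriving tuples, and a consumer never processes a value that has already been superseded.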

very large data bases | 2007

Extending XQuery with window functions

Irina Botan; Donald Kossmann; Peter Fischer; Tim Kraska; Dana Florescu; Rokas Tamosevicius

Technical report / ETH, Department of Computer Science | 2009

Design and Implementation of the MaxStream Federated Stream Processing Architecture

Irina Botan; Younggoo Cho; Roozbeh Derakhshan; Nihal Dindar; Laura M. Haas; Kihong Kim; Chulwon Lee; Girish Mundada; Ming-Chien Shan; Nesime Tatbul; Ying Yan; Beomjin Yun; Jin Zhang

Archive | 2009