Publication


Featured research published by Stan Zdonik.


international conference on data engineering | 2005

High-availability algorithms for distributed stream processing

Jeong-Hyon Hwang; Magdalena Balazinska; Alexander Rasin; Uğur Çetintemel; Michael Stonebraker; Stan Zdonik

Stream-processing systems are designed to support an emerging class of applications that require sophisticated and timely processing of high-volume data streams, often originating in distributed environments. Unlike traditional data-processing applications that require precise recovery for correctness, many stream-processing applications can tolerate and benefit from weaker recovery guarantees. In this paper, we study various recovery guarantees and pertinent recovery techniques that can meet the correctness and performance requirements of stream-processing applications. We discuss the design and algorithmic challenges associated with the proposed recovery techniques and describe how each can provide different guarantees with proper combinations of redundant processing, checkpointing, and remote logging. Using analysis and simulations, we quantify the cost of our recovery guarantees and examine the performance and applicability of the recovery techniques. We also analyze how the knowledge of query network properties can help decrease the cost of high availability.
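As a rough illustration of how checkpointing and remote logging can combine into a recovery guarantee, the sketch below has an upstream node keep emitted tuples in a log until the downstream node reports a checkpoint, then replay the unacknowledged tuples after a failure. It is a generic, simplified sketch and not the paper's algorithms; the class names, the sequence-number scheme, and the running-sum operator are all assumptions made for this example.

```python
class UpstreamNode:
    """Hypothetical sender that logs output tuples until they are acknowledged."""

    def __init__(self):
        self.log = []        # (seq, item) pairs not yet covered by a checkpoint
        self.next_seq = 0

    def emit(self, item):
        seq = self.next_seq
        self.next_seq += 1
        self.log.append((seq, item))
        return seq, item

    def ack(self, seq):
        # Downstream has checkpointed everything up to seq: trim the log.
        self.log = [(s, t) for s, t in self.log if s > seq]

    def replay(self):
        return list(self.log)


class DownstreamNode:
    """Hypothetical operator that keeps a running sum of its input."""

    def __init__(self):
        self.total = 0
        self.last_seq = -1
        self.checkpoint_state = (0, -1)

    def process(self, seq, item):
        if seq <= self.last_seq:
            return  # already applied; ignore the duplicate
        self.total += item
        self.last_seq = seq

    def checkpoint(self):
        # Save operator state and report the last sequence number it covers.
        self.checkpoint_state = (self.total, self.last_seq)
        return self.last_seq

    def recover(self, upstream):
        # Restore the last checkpoint, then reapply the logged tuples.
        self.total, self.last_seq = self.checkpoint_state
        for seq, item in upstream.replay():
            self.process(seq, item)
```

In this sketch, more frequent checkpoints shorten the log and the replay time at the price of more checkpoint work during normal operation, which is the kind of cost trade-off the abstract refers to.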


international conference on management of data | 2005

Distributed operation in the Borealis stream processing engine

Yanif Ahmad; Bradley Berg; Uğur Çetintemel; Mark Humphrey; Jeong-Hyon Hwang; Anjali Jhingran; Anurag S. Maskey; Olga Papaemmanouil; Alexander Rasin; Nesime Tatbul; Wenjuan Xing; Ying Xing; Stan Zdonik

Borealis is a distributed stream processing engine that is being developed at Brandeis University, Brown University, and MIT. Borealis inherits core stream processing functionality from Aurora and inter-node communication functionality from Medusa. We propose to demonstrate some of the key aspects of distributed operation in Borealis, using a multi-player network game as the underlying application. The demonstration will illustrate the dynamic resource management, query optimization and high availability mechanisms employed by Borealis, using visual performance-monitoring tools as well as the gaming experience.


international conference on management of data | 2015

The BigDAWG Polystore System

Jennie Duggan; Aaron J. Elmore; Michael Stonebraker; Magdalena Balazinska; Bill Howe; Jeremy Kepner; Samuel Madden; David Maier; Timothy G. Mattson; Stan Zdonik

This paper presents a new view of federated databases to address the growing need for managing information that spans multiple data models. This trend is fueled by the proliferation of storage engines and query languages based on the observation that “no one size fits all”. To address this shift, we propose a polystore architecture; it is designed to unify querying over multiple data models. We consider the challenges and opportunities associated with polystores. Open questions in this space revolve around query optimization and the assignment of objects to storage engines. We introduce our approach to these topics and discuss our prototype in the context of the Intel Science and Technology Center for Big Data.


very large data bases | 2015

An architecture for compiling UDF-centric workflows

Andrew Crotty; Alex Galakatos; Kayhan Dursun; Tim Kraska; Carsten Binnig; Ugur Çetintemel; Stan Zdonik

Data analytics has recently grown to include increasingly sophisticated techniques, such as machine learning and advanced statistics. Users frequently express these complex analytics tasks as workflows of user-defined functions (UDFs) that specify each algorithmic step. However, given typical hardware configurations and dataset sizes, the core challenge of complex analytics is no longer sheer data volume but rather the computation itself, and the next generation of analytics frameworks must focus on optimizing for this computation bottleneck. While query compilation has gained widespread popularity as a way to tackle the computation bottleneck for traditional SQL workloads, relatively little work addresses UDF-centric workflows in the domain of complex analytics. In this paper, we describe a novel architecture for automatically compiling workflows of UDFs. We also propose several optimizations that consider properties of the data, UDFs, and hardware together in order to generate different code on a case-by-case basis. To evaluate our approach, we implemented these techniques in Tupleware, a new high-performance distributed analytics system, and our benchmarks show performance improvements of up to three orders of magnitude compared to alternative systems.


Data Stream Management | 2016

The Aurora and Borealis Stream Processing Engines

Ugur Çetintemel; Daniel J. Abadi; Yanif Ahmad; Hari Balakrishnan; Magdalena Balazinska; Mitch Cherniack; Jeong-Hyon Hwang; Samuel Madden; Anurag S. Maskey; Alexander Rasin; Esther Ryvkina; Michael Stonebraker; Nesime Tatbul; Ying Xing; Stan Zdonik

Over the last several years, a great deal of progress has been made in the area of stream-processing engines (SPEs). Three basic tenets distinguish SPEs from current data processing engines. First, they must support primitives for streaming applications. Unlike Online Transaction Processing (OLTP), which processes messages in isolation, streaming applications entail time series operations on streams of messages. Second, streaming applications entail a real-time component. If one is content to see an answer later, then one can store incoming messages in a data warehouse and run a historical query on the warehouse to find information of interest. This tactic does not work if the answer must be constructed in real time. The need for real-time answers also dictates a fundamentally different storage architecture. DBMSs universally store and index data records before making them available for query activity. Such outbound processing, where data are stored before being processed, cannot deliver real-time latency, as required by SPEs. To meet more stringent latency requirements, SPEs must adopt an alternate model, which we refer to as “inbound processing”, where query processing is performed directly on incoming messages before (or instead of) storing them. Lastly, an SPE must have capabilities to gracefully deal with spikes in message load. Incoming traffic is usually bursty, and it is desirable to selectively degrade the performance of the applications running on an SPE. The Aurora stream-processing engine, motivated by these three tenets, is currently operational, has been used to build various application systems, and has been transferred to the commercial domain. Borealis is a distributed stream-processing system that inherits core stream-processing functionality from Aurora and enriches it with distribution functionality, in order to provide advanced capabilities that are commonly required by newly emerging stream-processing applications.
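The “inbound processing” model described above can be sketched in a few lines: each incoming message updates a continuous query immediately, and storage is an optional later step. This is a minimal illustration only, not Aurora's or Borealis's actual interface; `SlidingWindowAverage`, `inbound`, and the `archive` parameter are hypothetical names introduced for the example.

```python
from collections import deque


class SlidingWindowAverage:
    """Hypothetical continuous query: mean of the most recent `size` values."""

    def __init__(self, size):
        self.window = deque(maxlen=size)
        self.total = 0.0

    def on_message(self, value):
        # Drop the oldest value from the running total once the window is full.
        if len(self.window) == self.window.maxlen:
            self.total -= self.window[0]
        self.window.append(value)
        self.total += value
        return self.total / len(self.window)


def inbound(messages, query, archive=None):
    """Run the query on each message as it arrives; store it afterwards, if at all."""
    results = []
    for message in messages:
        results.append(query.on_message(message))  # processing comes first
        if archive is not None:
            archive.append(message)                # storage is optional
    return results


print(inbound([1, 2, 3, 4], SlidingWindowAverage(2)))  # [1.0, 1.5, 2.5, 3.5]
```

An outbound design would instead append every message to the archive first and compute the average with a later query over the stored data, which is where the extra latency comes from.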


ieee high performance extreme computing conference | 2016

Integrating real-time and batch processing in a polystore

John Meehan; Stan Zdonik; Shaobo Tian; Yulong Tian; Nesime Tatbul; Adam Dziedzic; Aaron J. Elmore

This paper describes a stream processing engine called S-Store and its role in the BigDAWG polystore. Fundamentally, S-Store acts as a frontend processor that accepts input from multiple sources, removes errors from it (data cleaning), and translates it into a form that can be efficiently ingested into BigDAWG. S-Store also acts as an intelligent router that sends input tuples to the appropriate components of BigDAWG. All updates to S-Store's shared memory are done in a transactionally consistent (ACID) way, thereby eliminating new errors caused by non-synchronized reads and writes. The ability to migrate data from component to component of BigDAWG is crucial. We have described a migrator from S-Store to Postgres that we have implemented as a first proof of concept. We report some interesting results using this migrator that impact the evaluation of query plans.


business intelligence for the real-time enterprises | 2017

Towards Dynamic Data Placement for Polystore Ingestion

Jiang Du; John Meehan; Nesime Tatbul; Stan Zdonik

Integrating low-latency data streaming into data warehouse architectures has become an important enhancement to support modern data warehousing applications. In these architectures, heterogeneous workloads with data ingestion and analytical queries must be executed with strict performance guarantees. Furthermore, the data warehouse may consist of multiple different types of storage engines (a.k.a. polystores or multi-stores). A paramount problem is data placement; different workload scenarios call for different data placement designs. Moreover, workload conditions change frequently. In this paper, we provide evidence that a dynamic, workload-driven approach is needed for data placement in polystores with low-latency data ingestion support. We study the problem based on the characteristics of the TPC-DI benchmark in the context of an abbreviated polystore that consists of S-Store and Postgres.


conference on innovative data systems research | 2005

The Design of the Borealis Stream Processing Engine

Daniel J. Abadi; Yanif Ahmad; Magdalena Balazinska; Mitch Cherniack; Jeong-Hyon Hwang; Wolfgang Lindner; Anurag S. Maskey; Alexander Rasin; Esther Ryvkina; Nesime Tatbul; Ying Xing; Stan Zdonik


conference on innovative data systems research | 2009

Requirements for Science Data Bases and SciDB

Michael Stonebraker; David Maier; Oliver Ratzesberger; Stan Zdonik


conference on innovative data systems research | 2007

One Size Fits All? - Part 2: Benchmarking Results

Michael Stonebraker; Chuck Bear; Ugur Çetintemel; Mitch Cherniack; Tingjian Ge; Nabil Hachem; Stavros Harizopoulos; John Lifter; Jennie Rogers; Stan Zdonik

Collaboration


Dive into Stan Zdonik's collaborations.

Top Co-Authors

Michael Stonebraker

Massachusetts Institute of Technology

David Maier

Portland State University

Samuel Madden

Massachusetts Institute of Technology
