Publications


Featured research published by Richard S. Sidle.


Very Large Data Bases | 2013

DB2 with BLU acceleration: so much more than just a column store

Vijayshankar Raman; Gopi K. Attaluri; Ronald J. Barber; Naresh K. Chainani; David Kalmuk; Vincent Kulandaisamy; Jens Leenstra; Sam Lightstone; Shaorong Liu; Guy M. Lohman; Tim R Malkemus; Rene Mueller; Ippokratis Pandis; Berni Schiefer; David C. Sharpe; Richard S. Sidle; Adam J. Storm; Liping Zhang

DB2 with BLU Acceleration deeply integrates innovative new techniques for defining and processing column-organized tables that speed read-mostly Business Intelligence queries by 10 to 50 times and improve compression by 3 to 10 times, compared to traditional row-organized tables, without the complexity of defining indexes or materialized views on those tables. But DB2 BLU is much more than just a column store. Exploiting frequency-based dictionary compression and main-memory query processing technology from the Blink project at IBM Research - Almaden, DB2 BLU performs most SQL operations - predicate application (even range predicates and IN-lists), joins, and grouping - on the compressed values, which can be packed bit-aligned so densely that multiple values fit in a register and can be processed simultaneously via SIMD (single-instruction, multiple-data) instructions. Designed and built from the ground up to exploit modern multi-core processors, DB2 BLU's hardware-conscious algorithms are carefully engineered to maximize parallelism by using novel data structures that need little latching, and to minimize data-cache and instruction-cache misses. Though DB2 BLU is optimized for in-memory processing, database size is not limited by the size of main memory. Fine-grained synopses, late materialization, and a new probabilistic buffer pool protocol for scans minimize disk I/Os, while aggressive prefetching reduces I/O stalls. Full integration with DB2 ensures that DB2 with BLU Acceleration benefits from the full functionality and robust utilities of a mature product, while still enjoying order-of-magnitude performance gains from revolutionary technology without even having to change the SQL, and can mix column-organized and row-organized tables in the same tablespace and even within the same query.
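
As a rough illustration of the register-packed predicate evaluation described above, the sketch below tests an equality predicate on several dictionary codes packed into one 64-bit word using ordinary integer instructions (SIMD within a register). The 8-bit code width, the helper names, and the example values are assumptions chosen for illustration, not DB2 BLU's actual encodings or kernels.

```cpp
// Sketch (assumed 8-bit codes, 8 codes per 64-bit word), not BLU's actual code.
#include <cstdint>
#include <cstdio>

constexpr uint64_t kHigh = 0x8080808080808080ULL;  // high bit of each 8-bit field
constexpr uint64_t kLow  = ~kHigh;                 // low 7 bits of each field

// Replicate an 8-bit code into every field of a 64-bit word.
constexpr uint64_t Splat(uint8_t code) { return 0x0101010101010101ULL * code; }

// High bit of field i is set in the result iff field i of `packed` equals `code`.
// Exact per-field zero test; no carries leak between fields.
uint64_t EqualMask(uint64_t packed, uint8_t code) {
    uint64_t x = packed ^ Splat(code);             // a field becomes 0 where it matches
    return ~(((x & kLow) + kLow) | x | kLow);      // flag the all-zero fields
}

int main() {
    // Eight encoded column values evaluated with a handful of scalar ops.
    uint64_t packed = 0x05'07'42'05'13'05'99'01ULL;
    uint64_t hits = EqualMask(packed, 0x05);
    std::printf("match bitmap: %016llx\n", (unsigned long long)hits);
    // The high bit is set in exactly the three fields that hold 0x05.
}
```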


International Conference on Data Engineering | 2008

Constant-Time Query Processing

Vijayshankar Raman; Garret Swart; Lin Qiao; Frederick R. Reiss; Vijay Dialani; Donald Kossmann; Inderpal Narang; Richard S. Sidle

Query performance in current systems depends significantly on tuning: how well the query matches the available indexes, materialized views, etc. Even in a well-tuned system, there are always some queries that take much longer than others. This frustrates users who increasingly want consistent response times to ad hoc queries. We argue that query processors should instead aim for constant response times for all queries, with no assumption about tuning. We present Blink, our first attempt at this goal, that runs every query as a table scan over a fully denormalized database, with hash group-by done along the way. To make this scan efficient, Blink uses a novel compression scheme that horizontally partitions tuples by frequency, thereby compressing skewed data almost down to entropy, even while producing long runs of fixed-length, easily-parseable values. We also present a scheme for evaluating a conjunction of range and equality predicates in SIMD fashion over compressed tuples, and different schemes for efficient hash-based aggregation within the L2 cache. An experimental study with a suite of arbitrary single-block SQL queries over a TPC-H-like schema suggests that constant-time queries can be efficient.
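
The frequency-partitioning idea above can be sketched in a few lines: the most common values of a column receive short fixed-length codes, the remaining values receive longer codes, and rows are split into partitions so that every row in a partition is coded at the same width. The 4-bit/16-bit split, the dictionary sizes, and all names below are assumptions for illustration, not Blink's actual scheme.

```cpp
// Sketch of frequency partitioning for a single string column.
#include <algorithm>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

struct Partition {
    int code_bits;                  // fixed code width used by this partition
    std::vector<uint32_t> codes;    // encoded column values
};

std::vector<Partition> FrequencyPartition(const std::vector<std::string>& column) {
    // 1. Histogram the column values.
    std::unordered_map<std::string, size_t> freq;
    for (const auto& v : column) ++freq[v];

    // 2. The (up to) 16 most frequent values get 4-bit codes; the rest get 16-bit codes.
    std::vector<std::pair<std::string, size_t>> by_freq(freq.begin(), freq.end());
    std::sort(by_freq.begin(), by_freq.end(),
              [](const auto& a, const auto& b) { return a.second > b.second; });
    std::unordered_map<std::string, uint32_t> short_dict, long_dict;
    for (const auto& entry : by_freq) {
        if (short_dict.size() < 16)
            short_dict.emplace(entry.first, static_cast<uint32_t>(short_dict.size()));
        else
            long_dict.emplace(entry.first, static_cast<uint32_t>(long_dict.size()));
    }

    // 3. Partition the rows by which dictionary encodes them, so each partition
    //    holds fixed-length, densely packable, easily parseable codes.
    Partition hot{4, {}}, cold{16, {}};
    for (const auto& v : column) {
        auto it = short_dict.find(v);
        if (it != short_dict.end()) hot.codes.push_back(it->second);
        else cold.codes.push_back(long_dict.at(v));
    }
    return {hot, cold};
}
```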


Very Large Data Bases | 2008

Row-wise parallel predicate evaluation

Ryan Johnson; Vijayshankar Raman; Richard S. Sidle; Garret Swart

Table scans have become more interesting recently due to greater use of ad-hoc queries and greater availability of multi-core, vector-enabled hardware. Table scan performance is limited by value representation, table layout, and processing techniques. In this paper we propose a new layout and processing technique for efficient one-pass predicate evaluation. Starting with a set of rows with a fixed number of bits per column, we append columns to form a set of banks and then pad each bank to a supported machine word length, typically 16, 32, or 64 bits. We then evaluate partial predicates on the columns of each bank, using a novel evaluation strategy that evaluates column level equality, range tests, IN-list predicates, and conjuncts of these predicates, simultaneously on multiple columns within a bank, and on multiple rows within a machine register. This approach outperforms pure column stores, which must evaluate the partial predicates one column at a time. We evaluate and compare the performance and representation overhead of this new approach and several proposed alternatives.
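
A minimal sketch of the bank layout and one-pass conjunctive test described above: several narrow columns of a row are packed side by side into one machine word, and a conjunction of equality predicates on two of those columns is evaluated with a single mask-and-compare. The column widths, field positions, and names are assumptions for illustration; the paper's layouts and its range and IN-list tests are more general.

```cpp
// Sketch of a 32-bit bank holding three packed columns (widths assumed).
#include <cstdint>
#include <cstdio>

//   bits  0..9  : quantity   (10 bits)
//   bits 10..21 : item_code  (12 bits)
//   bits 22..30 : store_id   ( 9 bits)    bit 31: padding
constexpr uint32_t PackBank(uint32_t quantity, uint32_t item, uint32_t store) {
    return (quantity & 0x3FF) | ((item & 0xFFF) << 10) | ((store & 0x1FF) << 22);
}

// "item_code == i AND store_id == s" tested on both columns at once with a
// single mask-and-compare, independent of the quantity column in the same bank.
struct EqConjunction {
    uint32_t mask;     // bits of the columns being tested
    uint32_t expect;   // expected values placed at their field positions
    bool Matches(uint32_t bank) const { return (bank & mask) == expect; }
};

EqConjunction ItemAndStore(uint32_t item, uint32_t store) {
    return {(0xFFFu << 10) | (0x1FFu << 22), (item << 10) | (store << 22)};
}

int main() {
    uint32_t rows[] = {PackBank(3, 512, 17), PackBank(9, 512, 4), PackBank(1, 7, 17)};
    EqConjunction pred = ItemAndStore(512, 17);
    for (uint32_t bank : rows)
        std::printf("%s\n", pred.Matches(bank) ? "match" : "no match");
    // Only the first row satisfies item_code == 512 AND store_id == 17.
}
```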


Very Large Data Bases | 2014

Memory-efficient hash joins

Ronald J. Barber; Guy M. Lohman; Ippokratis Pandis; Vijayshankar Raman; Richard S. Sidle; Gopi K. Attaluri; Naresh K. Chainani; Sam Lightstone; David C. Sharpe

We present new hash tables for joins, and a hash join based on them, that consumes far less memory and is usually faster than recently published in-memory joins. Our hash join is not restricted to outer tables that fit wholly in memory. Key to this hash join is a new concise hash table (CHT), a linear probing hash table that has 100% fill factor, and uses a sparse bitmap with embedded population counts to almost entirely avoid collisions. This bitmap also serves as a Bloom filter for use in multi-table joins. We study the random access characteristics of hash joins, and renew the case for non-partitioned hash joins. We introduce a variant of partitioned joins in which only the build is partitioned, but the probe is not, as this is more efficient for large outer tables than traditional partitioned joins. This also avoids partitioning costs during the probe, while at the same time allowing parallel build without latching overheads. Additionally, we present a variant of CHT, called a concise array table (CAT), that can be used when the key domain is moderately dense. CAT is collision-free and avoids storing join keys in the hash table. We perform a detailed comparison of CHT and CAT against leading in-memory hash joins. Our experiments show that we can reduce the memory usage by one to three orders of magnitude, while also being competitive in performance.
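
The concise hash table can be sketched as a sparse bitmap whose 64-bit words carry an embedded prefix population count, plus a dense array of entries addressed by rank, so the entry array itself is 100% full. The sketch below omits the overflow handling, payload columns, and Bloom-filter use of the bitmap that the paper describes; all names and parameters are illustrative assumptions.

```cpp
// Simplified sketch of a rank-addressed ("concise") hash table; the probe cap
// and the missing overflow table are simplifications, not the paper's design.
#include <bit>
#include <cstdint>
#include <optional>
#include <vector>

struct BitmapWord {
    uint64_t bits = 0;     // which slot positions in this word are occupied
    uint32_t prefix = 0;   // occupied slots in all earlier words
};

class ConciseHashTable {
public:
    // Build from distinct keys; the bitmap is ~4x oversized to stay sparse.
    explicit ConciseHashTable(const std::vector<uint64_t>& keys)
        : words_(keys.size() * 4 / 64 + 1) {
        size_t slots = words_.size() * 64;
        std::vector<uint64_t> slot_key(slots, kEmpty);   // scratch, discarded after build
        for (uint64_t k : keys) {                        // linear probing on the bitmap
            size_t s = Hash(k) % slots;
            while (slot_key[s] != kEmpty) s = (s + 1) % slots;
            slot_key[s] = k;
            words_[s / 64].bits |= 1ull << (s % 64);
        }
        uint32_t running = 0;                            // embed prefix population counts
        for (auto& w : words_) { w.prefix = running; running += std::popcount(w.bits); }
        entries_.resize(keys.size());                    // dense array, 100% full
        for (size_t s = 0; s < slots; ++s)
            if (slot_key[s] != kEmpty) entries_[Rank(s)] = slot_key[s];
    }

    // Probe: check the bit, then use the rank as the index into the dense array.
    std::optional<size_t> Find(uint64_t key) const {
        size_t slots = words_.size() * 64;
        size_t s = Hash(key) % slots;
        for (int step = 0; step < kMaxProbe; ++step, s = (s + 1) % slots) {
            const BitmapWord& w = words_[s / 64];
            if (!(w.bits & (1ull << (s % 64)))) return std::nullopt;  // empty slot: miss
            size_t idx = Rank(s);
            if (entries_[idx] == key) return idx;
        }
        return std::nullopt;   // gave up; the real CHT spills such keys to an overflow table
    }

private:
    static constexpr uint64_t kEmpty = ~0ull;   // sketch does not support a key of ~0
    static constexpr int kMaxProbe = 64;
    static uint64_t Hash(uint64_t k) { return k * 0x9E3779B97F4A7C15ull; }
    size_t Rank(size_t s) const {               // occupied slots strictly before slot s
        const BitmapWord& w = words_[s / 64];
        return w.prefix + std::popcount(w.bits & ((1ull << (s % 64)) - 1));
    }
    std::vector<BitmapWord> words_;
    std::vector<uint64_t> entries_;             // the join keys, stored densely
};
```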


International Conference on Management of Data | 2000

Maintenance of cube automatic summary tables

Wolfgang Lehner; Richard S. Sidle; Hamid Pirahesh; Roberta Cochrane

Materialized views (or Automatic Summary Tables—ASTs) are commonly used to improve the performance of aggregation queries by orders of magnitude. In contrast to regular tables, ASTs are synchronized by the database system. In this paper, we present techniques for maintaining cube ASTs. Our implementation is based on IBM DB2 UDB.


International Conference on Data Engineering | 2015

In-memory BLU acceleration in IBM's DB2 and dashDB: Optimized for modern workloads and hardware architectures

Ronald J. Barber; Guy M. Lohman; Vijayshankar Raman; Richard S. Sidle; Sam Lightstone; Berni Schiefer

Although the DRAM for main memories of systems continues to grow exponentially according to Moore's Law and to become less expensive, we argue that memory hierarchies will always exist for many reasons, both economic and practical, and in particular due to concurrent users competing for working memory to perform joins and grouping. We present the in-memory BLU Acceleration used in IBM's DB2 for Linux, UNIX, and Windows, and now also the dashDB cloud offering, which was designed and implemented from the ground up to exploit main memory but is not limited to what fits in memory and does not require manual management of what to retain in memory, as its competitors do. In fact, BLU Acceleration views memory as too slow, and is carefully engineered to work in higher levels of the system cache by keeping the data encoded and packed densely into bit-aligned vectors that can exploit SIMD instructions in processing queries. To achieve scalable multi-core parallelism, BLU assigns to each thread independent data structures, or partitions thereof, designed to have low synchronization costs, and doles out batches of values to threads. On customer workloads, BLU has improved performance on complex analytics queries by 10 to 50 times, compared to the legacy row-organized run-time, while also significantly simplifying database administration, shortening time to value, and improving data compression. UPDATE and DELETE performance was improved by up to 112 times with the new Cancun release of DB2 with BLU Acceleration, which also added Shadow Tables for high performance on mixed OLTP and BI analytics workloads, and extended DB2's High Availability Disaster Recovery (HADR) and SQL compatibility features to BLU's column-organized tables.
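
The parallelism pattern described above, batches of values doled out to threads that aggregate into thread-private state, can be sketched as follows; the batch size, the sum aggregate, and all names are illustrative assumptions rather than BLU's internals.

```cpp
// Sketch: threads claim 4096-value batches via one atomic counter and keep
// thread-private accumulators, so no latches are taken on the data path.
#include <algorithm>
#include <atomic>
#include <cstdint>
#include <cstdio>
#include <numeric>
#include <thread>
#include <vector>

int main() {
    std::vector<uint32_t> column(1'000'000, 1);   // stand-in for an encoded column
    constexpr size_t kBatch = 4096;               // values handed out per claim (assumed)
    const unsigned nthreads = std::max(1u, std::thread::hardware_concurrency());

    std::atomic<size_t> cursor{0};                // next unclaimed batch start
    std::vector<uint64_t> partial(nthreads, 0);   // one accumulator slot per thread

    std::vector<std::thread> workers;
    for (unsigned t = 0; t < nthreads; ++t) {
        workers.emplace_back([&, t] {
            uint64_t local = 0;
            for (;;) {
                size_t begin = cursor.fetch_add(kBatch, std::memory_order_relaxed);
                if (begin >= column.size()) break;                 // nothing left
                size_t end = std::min(begin + kBatch, column.size());
                for (size_t i = begin; i < end; ++i) local += column[i];
            }
            partial[t] = local;                   // thread-private slot, no latch needed
        });
    }
    for (auto& w : workers) w.join();

    uint64_t total = std::accumulate(partial.begin(), partial.end(), uint64_t{0});
    std::printf("sum = %llu\n", (unsigned long long)total);        // expect 1000000
}
```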


Business Intelligence for the Real-Time Enterprise | 2011

Blink: Not Your Father's Database!

Ronald J. Barber; Peter Bendel; Marco Czech; Oliver Draese; Frederick Ho; Namik Hrle; Stratos Idreos; Min-Soo Kim; Oliver Koeth; Jae-Gil Lee; Tianchao Tim Li; Guy M. Lohman; Konstantinos Morfonios; Rene Mueller; Keshava Murthy; Ippokratis Pandis; Lin Qiao; Vijayshankar Raman; Sandor Szabo; Richard S. Sidle; Knut Stolze

The Blink project’s ambitious goals are to answer all Business Intelligence (BI) queries in mere seconds, regardless of the database size, with an extremely low total cost of ownership. It takes a very innovative and counter-intuitive approach to processing BI queries, one that exploits several disruptive hardware and software technology trends. Specifically, it is a new, workload-optimized DBMS aimed primarily at BI query processing, and exploits scale-out of commodity multi-core processors and cheap DRAM to retain a (copy of a) data mart completely in main memory. Additionally, it exploits proprietary compression technology and cache-conscious algorithms that reduce memory bandwidth consumption and allow most SQL query processing to be performed on the compressed data. Ignoring the general wisdom of the last three decades that the only way to scalably search large databases is with indexes, Blink always performs simple, “brute force” scans of the entire data mart in parallel on all nodes, without using any indexes or materialized views, and without any query optimizer to choose among them. The Blink technology has thus far been incorporated into two products: (1) an accelerator appliance product for DB2 for z/OS (on the “mainframe”), called the IBM Smart Analytics Optimizer for DB2 for z/OS, V1.1, which was generally available in November 2010; and (2) the Informix Warehouse Accelerator (IWA), a software-only version that was generally available in March 2011. We are now working on the next generation of Blink, called BLink Ultra, or BLU, which will significantly expand the “sweet spot” of Blink technology to much larger, disk-based warehouses and allow BLU to “own” the data, rather than copies of it.


International Conference on Management of Data | 2016

Wildfire: Concurrent Blazing Data Ingest and Analytics

Ronald J. Barber; Matt Huras; Guy M. Lohman; C. Mohan; Rene Mueller; Fatma Ozcan; Hamid Pirahesh; Vijayshankar Raman; Richard S. Sidle; Oleg Sidorkin; Adam J. Storm; Yuanyuan Tian; Pinar Tözün

We demonstrate Hybrid Transactional and Analytics Processing (HTAP) on the Spark platform by the Wildfire prototype, which can ingest up to ~6 million inserts per second per node and simultaneously perform complex SQL analytics queries. Here, a simplified mobile application uses Wildfire to recommend advertising to mobile customers based upon their distance from stores and their interest in products sold by these stores, while continuously graphing analytics results as those customers move and respond to the ads with purchases.


Archive | 1997

Shared nothing parallel execution of procedural constructs in SQL

Roberta Jo Cochrane; George Lapis; Mir Hamid Pirahesh; Richard S. Sidle; David E. Simmen; Tuong Chanh Truong; Monica Sachiye Urata


Archive | 1995

Query optimizer system that detects and prevents mutating table violations of database integrity in a query before execution plan generation

Roberta Jo Cochrane; Mir Hamid Pirahesh; Richard S. Sidle
