
Publication


Featured research published by Ronald J. Barber.


very large data bases | 2013

DB2 with BLU acceleration: so much more than just a column store

Vijayshankar Raman; Gopi K. Attaluri; Ronald J. Barber; Naresh K. Chainani; David Kalmuk; Vincent Kulandaisamy; Jens Leenstra; Sam Lightstone; Shaorong Liu; Guy M. Lohman; Tim R Malkemus; Rene Mueller; Ippokratis Pandis; Berni Schiefer; David C. Sharpe; Richard S. Sidle; Adam J. Storm; Liping Zhang

DB2 with BLU Acceleration deeply integrates innovative new techniques for defining and processing column-organized tables that speed read-mostly Business Intelligence queries by 10 to 50 times and improve compression by 3 to 10 times, compared to traditional row-organized tables, without the complexity of defining indexes or materialized views on those tables. But DB2 BLU is much more than just a column store. Exploiting frequency-based dictionary compression and main-memory query processing technology from the Blink project at IBM Research - Almaden, DB2 BLU performs most SQL operations - predicate application (even range predicates and IN-lists), joins, and grouping - on the compressed values, which can be packed bit-aligned so densely that multiple values fit in a register and can be processed simultaneously via SIMD (single-instruction, multiple-data) instructions. Designed and built from the ground up to exploit modern multi-core processors, DB2 BLU's hardware-conscious algorithms are carefully engineered to maximize parallelism by using novel data structures that need little latching, and to minimize data-cache and instruction-cache misses. Though DB2 BLU is optimized for in-memory processing, database size is not limited by the size of main memory. Fine-grained synopses, late materialization, and a new probabilistic buffer pool protocol for scans minimize disk I/Os, while aggressive prefetching reduces I/O stalls. Full integration with DB2 ensures that DB2 with BLU Acceleration benefits from the full functionality and robust utilities of a mature product, while still enjoying order-of-magnitude performance gains from revolutionary technology without even having to change the SQL, and can mix column-organized and row-organized tables in the same tablespace and even within the same query.
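
The word-parallel evaluation of predicates on packed codes is easy to see in miniature. The following C++ sketch is purely illustrative (it is not DB2 code, and the 8-bit field width is an assumption): it packs eight dictionary codes into one 64-bit word, reserves the top bit of each field as a guard, and tests "code < c" on all eight codes with a single subtraction, the scalar analogue of what BLU does with SIMD registers and bit-aligned packing.

```cpp
#include <cstdint>
#include <cstdio>

// Illustrative sketch only, not DB2 code: evaluate "code < c" on several
// dictionary codes packed into one 64-bit word, in the spirit of BLU's
// predicate evaluation on compressed values.
//
// Layout assumption: each field is W bits wide, and the top bit of every
// field is a guard bit kept at 0, so codes occupy W-1 bits.
constexpr int      W      = 8;                      // bits per field
constexpr int      FIELDS = 64 / W;                 // 8 codes per word
constexpr uint64_t GUARDS = 0x8080808080808080ull;  // guard bit of each field

// Replicate a (W-1)-bit constant into every field of a 64-bit word.
uint64_t replicate(uint64_t c) {
    uint64_t r = 0;
    for (int i = 0; i < FIELDS; ++i) r |= c << (i * W);
    return r;
}

// Returns a word whose guard bit is set in every field whose code is < c.
// The guard survives the subtraction exactly when code >= c, and no borrow
// can cross a field boundary because each field's result stays positive.
uint64_t less_than(uint64_t packed, uint64_t c) {
    uint64_t t = (packed | GUARDS) - replicate(c);
    return ~t & GUARDS;
}

int main() {
    uint64_t codes[FIELDS] = {3, 9, 5, 12, 0, 7, 7, 100};
    uint64_t packed = 0;                              // pack, low field first
    for (int i = 0; i < FIELDS; ++i) packed |= codes[i] << (i * W);

    uint64_t hits = less_than(packed, 7);             // which codes are < 7?
    for (int i = 0; i < FIELDS; ++i)
        printf("code %3llu -> %s\n", (unsigned long long)codes[i],
               (hits >> (i * W + W - 1)) & 1 ? "match" : "no match");
}
```

A real column store would choose the field width per column from its dictionary size and would use wide SIMD registers rather than one scalar word, but the guard-bit subtraction trick is the same.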


very large data bases | 2014

Memory-efficient hash joins

Ronald J. Barber; Guy M. Lohman; Ippokratis Pandis; Vijayshankar Raman; Richard S. Sidle; Gopi K. Attaluri; Naresh K. Chainani; Sam Lightstone; David C. Sharpe

We present new hash tables for joins, and a hash join based on them that consumes far less memory and is usually faster than recently published in-memory joins. Our hash join is not restricted to outer tables that fit wholly in memory. Key to this hash join is a new concise hash table (CHT), a linear probing hash table that has 100% fill factor, and uses a sparse bitmap with embedded population counts to almost entirely avoid collisions. This bitmap also serves as a Bloom filter for use in multi-table joins. We study the random access characteristics of hash joins, and renew the case for non-partitioned hash joins. We introduce a variant of partitioned joins in which only the build is partitioned, but the probe is not, as this is more efficient for large outer tables than traditional partitioned joins. This also avoids partitioning costs during the probe, while at the same time allowing parallel build without latching overheads. Additionally, we present a variant of CHT, called a concise array table (CAT), that can be used when the key domain is moderately dense. CAT is collision-free and avoids storing join keys in the hash table. We perform a detailed comparison of CHT and CAT against leading in-memory hash joins. Our experiments show that we can reduce the memory usage by one to three orders of magnitude, while also being competitive in performance.
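
The core machinery of a CHT can be sketched compactly. The C++ below is a hypothetical simplification, not the authors' implementation: a bitmap over sparse buckets plus per-word prefix population counts maps each set bit to a slot in a 100%-full dense key array, an unset bit doubles as a Bloom-filter-style definite miss, and collisions are resolved by plain linear probing in bitmap space.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstdio>
#include <utility>
#include <vector>

// Hypothetical simplification of a concise hash table (CHT), not the
// authors' code: a bitmap over sparse buckets plus per-word prefix
// population counts maps each set bit to a slot in a 100%-full dense
// key array. An unset bit is a definite miss (the Bloom-filter role).
struct ConciseHashTable {
    std::vector<uint64_t> bitmap;  // one bit per sparse bucket
    std::vector<uint32_t> prefix;  // set bits in all words before this one
    std::vector<uint64_t> keys;    // dense array, exactly one slot per key
    uint64_t mask;                 // nbuckets - 1 (power of two)

    static uint64_t hash(uint64_t k) { return (k * 0x9E3779B97F4A7C15ull) >> 32; }

    explicit ConciseHashTable(const std::vector<uint64_t>& input) {
        uint64_t nbuckets = 1;
        while (nbuckets < 2 * input.size()) nbuckets <<= 1;  // ~50% occupancy
        mask = nbuckets - 1;
        bitmap.assign(nbuckets / 64 + 1, 0);
        std::vector<std::pair<uint64_t, uint64_t>> placed;   // (bit pos, key)
        for (uint64_t k : input) {
            uint64_t p = hash(k) & mask;                     // linear probing
            while (bitmap[p / 64] >> (p % 64) & 1) p = (p + 1) & mask;
            bitmap[p / 64] |= 1ull << (p % 64);
            placed.push_back({p, k});
        }
        std::sort(placed.begin(), placed.end());             // bitmap order
        for (auto& pk : placed) keys.push_back(pk.second);
        prefix.assign(bitmap.size(), 0);
        for (size_t w = 1; w < bitmap.size(); ++w)
            prefix[w] = prefix[w - 1] + __builtin_popcountll(bitmap[w - 1]);
    }

    bool contains(uint64_t k) const {
        for (uint64_t p = hash(k) & mask; ; p = (p + 1) & mask) {
            if (!(bitmap[p / 64] >> (p % 64) & 1)) return false;  // definite miss
            uint32_t slot = prefix[p / 64] +                      // rank = dense slot
                __builtin_popcountll(bitmap[p / 64] & ((1ull << (p % 64)) - 1));
            if (keys[slot] == k) return true;
        }
    }
};

int main() {
    ConciseHashTable cht({10, 42, 7, 99, 1234});
    printf("42 present: %d, 43 present: %d\n", cht.contains(42), cht.contains(43));
}
```

Note that the dense array holds no empty slots at all: sparsity lives entirely in the bitmap, so an empty bucket costs a bit rather than a key-sized array entry, which is where this design saves memory over pointer- or bucket-based hash tables.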


international conference on data engineering | 2015

In-memory BLU acceleration in IBM's DB2 and dashDB: Optimized for modern workloads and hardware architectures

Ronald J. Barber; Guy M. Lohman; Vijayshankar Raman; Richard S. Sidle; Sam Lightstone; Berni Schiefer

Although the DRAM for main memories of systems continues to grow exponentially according to Moore's Law and to become less expensive, we argue that memory hierarchies will always exist for many reasons, both economic and practical, and in particular due to concurrent users competing for working memory to perform joins and grouping. We present the in-memory BLU Acceleration used in IBM's DB2 for Linux, UNIX, and Windows, and now also the dashDB cloud offering, which was designed and implemented from the ground up to exploit main memory but is not limited to what fits in memory and does not require manual management of what to retain in memory, as its competitors do. In fact, BLU Acceleration views memory as too slow, and is carefully engineered to work in higher levels of the system cache by keeping the data encoded and packed densely into bit-aligned vectors that can exploit SIMD instructions in processing queries. To achieve scalable multi-core parallelism, BLU assigns to each thread independent data structures, or partitions thereof, designed to have low synchronization costs, and doles out batches of values to threads. On customer workloads, BLU has improved performance on complex analytics queries by 10 to 50 times, compared to the legacy row-organized run-time, while also significantly simplifying database administration, shortening time to value, and improving data compression. UPDATE and DELETE performance was improved by up to 112 times with the new Cancun release of DB2 with BLU Acceleration, which also added Shadow Tables for high performance on mixed OLTP and BI analytics workloads, and extended DB2's High Availability Disaster Recovery (HADR) and SQL compatibility features to BLU's column-organized tables.
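
The batch-at-a-time threading model described above can be sketched in a few lines. The C++ below is a hypothetical illustration (the batch size, thread count, and toy grouping column are invented, and this is not DB2 code): worker threads claim batches from a shared atomic cursor and aggregate into thread-private hash tables, so the scan path itself needs no latches, and the partial results are merged once at the end.

```cpp
#include <algorithm>
#include <atomic>
#include <cstdint>
#include <cstdio>
#include <thread>
#include <unordered_map>
#include <vector>

// Hypothetical illustration, not DB2 code: threads pull batches of rows
// from a shared atomic cursor and aggregate into thread-private hash
// tables; the only shared write on the scan path is the cursor itself.
int main() {
    const size_t N = 1'000'000, BATCH = 4096;
    std::vector<uint32_t> col(N);                    // toy grouping column
    for (size_t i = 0; i < N; ++i) col[i] = i % 10;  // 10 groups

    unsigned nthreads = std::max(1u, std::thread::hardware_concurrency());
    std::vector<std::unordered_map<uint32_t, uint64_t>> local(nthreads);
    std::atomic<size_t> cursor{0};

    auto worker = [&](unsigned t) {
        for (;;) {
            size_t lo = cursor.fetch_add(BATCH);     // claim the next batch
            if (lo >= N) return;
            size_t hi = std::min(lo + BATCH, N);
            for (size_t i = lo; i < hi; ++i)
                ++local[t][col[i]];                  // thread-private, latch-free
        }
    };
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < nthreads; ++t) pool.emplace_back(worker, t);
    for (auto& th : pool) th.join();

    std::unordered_map<uint32_t, uint64_t> result;   // single-threaded merge
    for (auto& m : local)
        for (auto& [g, c] : m) result[g] += c;
    for (auto& [g, c] : result)
        printf("group %u: count %llu\n", g, (unsigned long long)c);
}
```

A single fetch_add per batch amortizes synchronization over thousands of rows; everything else touches only thread-private state, which is what makes this pattern scale across cores.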


business intelligence for the real time enterprises | 2011

Blink: Not Your Father's Database!

Ronald J. Barber; Peter Bendel; Marco Czech; Oliver Draese; Frederick Ho; Namik Hrle; Stratos Idreos; Min-Soo Kim; Oliver Koeth; Jae-Gil Lee; Tianchao Tim Li; Guy M. Lohman; Konstantinos Morfonios; Rene Mueller; Keshava Murthy; Ippokratis Pandis; Lin Qiao; Vijayshankar Raman; Sandor Szabo; Richard S. Sidle; Knut Stolze

The Blink project’s ambitious goals are to answer all Business Intelligence (BI) queries in mere seconds, regardless of the database size, with an extremely low total cost of ownership. It takes a very innovative and counter-intuitive approach to processing BI queries, one that exploits several disruptive hardware and software technology trends. Specifically, it is a new, workload-optimized DBMS aimed primarily at BI query processing, and exploits scale-out of commodity multi-core processors and cheap DRAM to retain a (copy of a) data mart completely in main memory. Additionally, it exploits proprietary compression technology and cache-conscious algorithms that reduce memory bandwidth consumption and allow most SQL query processing to be performed on the compressed data. Ignoring the general wisdom of the last three decades that the only way to scalably search large databases is with indexes, Blink always performs simple, “brute force” scans of the entire data mart in parallel on all nodes, without using any indexes or materialized views, and without any query optimizer to choose among them. The Blink technology has thus far been incorporated into two products: (1) an accelerator appliance product for DB2 for z/OS (on the “mainframe”), called the IBM Smart Analytics Optimizer for DB2 for z/OS, V1.1, which was generally available in November 2010; and (2) the Informix Warehouse Accelerator (IWA), a software-only version that was generally available in March 2011. We are now working on the next generation of Blink, called BLink Ultra, or BLU, which will significantly expand the “sweet spot” of Blink technology to much larger, disk-based warehouses and allow BLU to “own” the data, rather than copies of it.
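
A rough back-of-envelope calculation, with illustrative numbers not taken from the paper, shows why index-free scans can plausibly meet the "mere seconds" goal: a 1 TB data mart compressed 10 times occupies about 100 GB; spread across 20 nodes, each node scans roughly 5 GB per query, and at a DRAM bandwidth of about 50 GB/s that scan completes in around 0.1 seconds, even before predicate evaluation on compressed values reduces the bytes actually touched.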


international conference on management of data | 2016

Wildfire: Concurrent Blazing Data Ingest and Analytics

Ronald J. Barber; Matt Huras; Guy M. Lohman; C. Mohan; Rene Mueller; Fatma Ozcan; Hamid Pirahesh; Vijayshankar Raman; Richard S. Sidle; Oleg Sidorkin; Adam J. Storm; Yuanyuan Tian; Pinar Tözün

We demonstrate Hybrid Transactional and Analytics Processing (HTAP) on the Spark platform with the Wildfire prototype, which can ingest up to ~6 million inserts per second per node and simultaneously perform complex SQL analytics queries. Here, a simplified mobile application uses Wildfire to recommend advertising to mobile customers based upon their distance from stores and their interest in products sold by these stores, while continuously graphing analytics results as those customers move and respond to the ads with purchases.


symposium on cloud computing | 2013

Go, server, go!: parallel computing with moving servers

Ronald J. Barber; Guy M. Lohman; Rene Mueller; Ippokratis Pandis; Vijayshankar Raman; W. Wilcke

In data centers today, servers are stationary and data flows on a hierarchical network of switches and routers. But such static server arrangements require very scalable networks, and many applications are bottlenecked by network bandwidth. In addition, server density is kept low to enable maintenance and upgrades, as well as to increase air flow. In this paper, we propose a design in which servers move physically, and communicate via point-to-point connections (instead of switches). We argue that this allows data transfer bandwidth to scale linearly with the number of servers, and that moving servers is not as expensive as it sounds, at least in terms of power consumption. Moreover, while servers move around, they regularly reach the perimeters of the system, which helps with heat dissipation and with servicing of failed nodes. This design can also help in traditional switch-based networks by improving density and maintainability.


Archive | 1994

Image query system and method

Ronald J. Barber; Bradley James Beitel; William H. R. Equitz; Carlton Wayne Niblack; Dragutin Petkovic; Thomas R. Work; Peter Cornelius Yanker


Archive | 1998

Sleep mode transition between processors sharing an instruction set and an address space

Ronald J. Barber; Edwin Joseph Selker


Archive | 1996

Tactile feedback controller for computer cursor control device

Ronald J. Barber; Edwin Joseph Selker


Archive | 1997

Multiple display pointers for computer graphical user interfaces

Ronald J. Barber; Daniel Alexander Ford; Edwin Joseph Selker
