David C. Sharpe | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where David C. Sharpe is active.

Explore More

Publication

Featured researches published by David C. Sharpe.

very large data bases | 2013

DB2 with BLU acceleration: so much more than just a column store

Vijayshankar Raman; Gopi K. Attaluri; Ronald J. Barber; Naresh K. Chainani; David Kalmuk; Vincent Kulandaisamy; Jens Leenstra; Sam Lightstone; Shaorong Liu; Guy M. Lohman; Tim R Malkemus; Rene Mueller; Ippokratis Pandis; Berni Schiefer; David C. Sharpe; Richard S. Sidle; Adam J. Storm; Liping Zhang

DB2 with BLU Acceleration deeply integrates innovative new techniques for defining and processing column-organized tables that speed read-mostly Business Intelligence queries by 10 to 50 times and improve compression by 3 to 10 times, compared to traditional row-organized tables, without the complexity of defining indexes or materialized views on those tables. But DB2 BLU is much more than just a column store. Exploiting frequency-based dictionary compression and main-memory query processing technology from the Blink project at IBM Research - Almaden, DB2 BLU performs most SQL operations - predicate application (even range predicates and IN-lists), joins, and grouping - on the compressed values, which can be packed bit-aligned so densely that multiple values fit in a register and can be processed simultaneously via SIMD (single-instruction, multipledata) instructions. Designed and built from the ground up to exploit modern multi-core processors, DB2 BLUs hardware-conscious algorithms are carefully engineered to maximize parallelism by using novel data structures that need little latching, and to minimize data-cache and instruction-cache misses. Though DB2 BLU is optimized for in-memory processing, database size is not limited by the size of main memory. Fine-grained synopses, late materialization, and a new probabilistic buffer pool protocol for scans minimize disk I/Os, while aggressive prefetching reduces I/O stalls. Full integration with DB2 ensures that DB2 with BLU Acceleration benefits from the full functionality and robust utilities of a mature product, while still enjoying order-of-magnitude performance gains from revolutionary technology without even having to change the SQL, and can mix column-organized and row-organized tables in the same tablespace and even within the same query.

very large data bases | 2014

Memory-efficient hash joins

Ronald J. Barber; Guy M. Lohman; Ippokratis Pandis; Vijayshankar Raman; Richard S. Sidle; Gopi K. Attaluri; Naresh K. Chainani; Sam Lightstone; David C. Sharpe

We present new hash tables for joins, and a hash join based on them, that consumes far less memory and is usually faster than recently published in-memory joins. Our hash join is not restricted to outer tables that fit wholly in memory. Key to this hash join is a new concise hash table (CHT), a linear probing hash table that has 100% fill factor, and uses a sparse bitmap with embedded population counts to almost entirely avoid collisions. This bitmap also serves as a Bloom filter for use in multi-table joins. We study the random access characteristics of hash joins, and renew the case for non-partitioned hash joins. We introduce a variant of partitioned joins in which only the build is partitioned, but the probe is not, as this is more efficient for large outer tables than traditional partitioned joins. This also avoids partitioning costs during the probe, while at the same time allowing parallel build without latching overheads. Additionally, we present a variant of CHT, called a concise array table (CAT), that can be used when the key domain is moderately dense. CAT is collision-free and avoids storing join keys in the hash table. We perform a detailed comparison of CHT and CAT against leading in-memory hash joins. Our experiments show that we can reduce the memory usage by one to three orders of magnitude, while also being competitive in performance.

Archive | 2004