2019 IEEE 38th International Performance Computing and Communications Conference (IPCCC) | 2019

Emulate Processing of Assorted Database Server Applications on Flash-Based Storage in Datacenter Infrastructures

 
 
 
 

Abstract


In the era of big data processing, more and more cloud storage datacenters are replacing traditional HDDs with enterprise SSDs. Both developers and users of these SSDs require thorough benchmarking to evaluate their performance impact. However, I/O performance measured with synthetic workloads or classic benchmarks differs drastically from that of real I/O activity in the datacenter. We therefore propose a new framework, called Pattern I/O generator (PatIO), to collectively capture the enterprise storage behavior that prevails across assorted user workloads and system configurations for different database server applications on flash-based storage. PatIO is designed to emulate the processing of real-world I/O activities easily, with lower time and resource requirements. Our methodology comprises three main steps: (1) dissect the overall I/O activities of various real workloads and identify the prevailing attributes in distinct visual I/O patterns; (2) construct a pattern warehouse as a collection of unique I/O patterns that are generated through various combinations of multiple I/O jobs; and (3) integrate different combinations of these synthetically generated I/O patterns to reproduce the comprehensive characteristics of various real workloads and system setups for the database server applications. To make the framework easy to use, we develop a graphical user interface (GUI). We evaluate our framework by comparing the I/O characteristics and I/O performance of the generated workloads with those of real-world workloads for multiple database applications such as MySQL, Cassandra, and ForestDB.
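Steps (2) and (3) of the methodology can be pictured with a minimal sketch. This is not PatIO's implementation: the names `IOJob`, `Pattern`, `build_warehouse`, and `closest_pattern` are hypothetical, the job attributes (read ratio, block size, weight) are assumed for illustration, and matching on a single read-ratio attribute is a deliberate simplification of reproducing a real workload's characteristics.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class IOJob:
    """One synthetic I/O job. All fields are illustrative assumptions."""
    name: str
    read_ratio: float    # fraction of requests that are reads, 0.0 to 1.0
    block_kib: int       # request size in KiB
    weight: float = 1.0  # relative share of requests issued by this job


@dataclass(frozen=True)
class Pattern:
    """An I/O pattern modeled as a combination of jobs running together."""
    jobs: tuple

    @property
    def read_ratio(self) -> float:
        # Weighted average of the read ratios of the member jobs.
        total = sum(j.weight for j in self.jobs)
        return sum(j.read_ratio * j.weight for j in self.jobs) / total


def build_warehouse(jobs, max_jobs=2):
    """Step (2), simplified: enumerate job combinations as candidate patterns."""
    return [Pattern(combo)
            for k in range(1, max_jobs + 1)
            for combo in combinations(jobs, k)]


def closest_pattern(warehouse, target_read_ratio):
    """Step (3), simplified: pick the pattern nearest a target attribute."""
    return min(warehouse, key=lambda p: abs(p.read_ratio - target_read_ratio))


jobs = [
    IOJob("seq_read", read_ratio=1.0, block_kib=128),
    IOJob("rand_write", read_ratio=0.0, block_kib=4),
    IOJob("mixed", read_ratio=0.7, block_kib=16),
]
warehouse = build_warehouse(jobs)
best = closest_pattern(warehouse, target_read_ratio=0.5)
print([j.name for j in best.jobs])
```

In this toy setup, the warehouse holds every single job and every pair of jobs, and the selection step returns the combination whose aggregate read ratio is nearest the target. A realistic version would presumably match several attributes at once and combine multiple patterns over time.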

Pages 1-8
DOI 10.1109/IPCCC47392.2019.8958744
Language English
Journal 2019 IEEE 38th International Performance Computing and Communications Conference (IPCCC)
