Guanying Wu
Virginia Commonwealth University
Publication
Featured research published by Guanying Wu.
european conference on computer systems | 2012
Guanying Wu; Xubin He
NAND flash-based SSDs suffer from limited lifetime due to the fact that NAND flash can only be programmed or erased a limited number of times. Among various approaches to address this problem, we propose to reduce the number of writes to the flash by exploiting the content locality between the write data and its corresponding old version in the flash. This content locality means that the new version, i.e., the content of a new write request, shares some degree of similarity with its old version. The information redundancy in the difference (delta) between the new and old data allows the delta to be compressed to a small fraction of its original size. The key idea of our approach, named Delta-FTL (Delta Flash Translation Layer), is to store this compressed delta in the SSD, instead of the original new data, in order to reduce the number of writes committed to the flash. This write reduction further extends the lifetime of SSDs because garbage collection, a significant source of write amplification in SSDs, runs less frequently. Experimental results based on our Delta-FTL prototype show that Delta-FTL can significantly reduce the number of writes and garbage collection operations and thus improve SSD lifetime at the cost of a trivial overhead in read latency.
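A minimal sketch of the delta-write idea described in the abstract, assuming a hypothetical FTL interface: the new page is XOR-diffed against its old version, the delta is compressed, and the compressed delta is stored only if it is small enough to be worthwhile. Names such as delta_write and the 50% threshold are illustrative, not the paper's implementation.

```python
import zlib

PAGE_SIZE = 4096          # bytes per flash page (typical)
DELTA_THRESHOLD = 0.5     # store a delta only if it compresses below 50% of a page (assumed)

def xor_delta(new: bytes, old: bytes) -> bytes:
    """Byte-wise XOR of the new page against its old version.

    Unchanged regions become runs of zero bytes, which compress well
    when the two versions share content (content locality)."""
    return bytes(a ^ b for a, b in zip(new, old))

def delta_write(new_page: bytes, old_page: bytes):
    """Decide between writing the full page and writing a compressed delta."""
    delta = xor_delta(new_page, old_page)
    compressed = zlib.compress(delta)
    if len(compressed) < DELTA_THRESHOLD * PAGE_SIZE:
        return ("delta", compressed)      # fewer bytes committed to flash
    return ("full", new_page)             # delta too large, write normally

def read_back(kind: str, payload: bytes, old_page: bytes) -> bytes:
    """Reconstruct the logical page from either a full write or a stored delta."""
    if kind == "full":
        return payload
    return xor_delta(zlib.decompress(payload), old_page)

# Example: a page where only a small region changed compresses to a tiny delta
old = bytes(PAGE_SIZE)
new = bytearray(old); new[100:110] = b"updated!!!"
kind, payload = delta_write(bytes(new), old)
assert read_back(kind, payload, old) == bytes(new)
```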
dependable systems and networks | 2011
Chentao Wu; Xubin He; Guanying Wu; Shenggang Wan; Xiaohua Liu; Qiang Cao; Changsheng Xie
With higher reliability requirements in clusters and data centers, RAID-6 has gained popularity due to its capability to tolerate concurrent failures of any two disks, which has been shown to be of increasing importance in large scale storage systems. Among various implementations of erasure codes in RAID-6, a typical set of codes known as Maximum Distance Separable (MDS) codes aims to offer data protection against disk failures with optimal storage efficiency. However, because of the limitations of the horizontal parity or diagonal/anti-diagonal parities used in MDS codes, storage systems based on RAID-6 suffer from unbalanced I/O and thus low performance and reliability. To address this issue, in this paper we propose a new parity called Horizontal-Diagonal Parity (HDP), which takes advantage of both horizontal and diagonal/anti-diagonal parities. The corresponding MDS code, called the HDP code, distributes parity elements uniformly across disks to balance the I/O workload. HDP also achieves high reliability by speeding up recovery from single or double disk failures. Our analysis shows that HDP provides better balanced I/O and higher reliability compared to other popular MDS codes.
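The construction details of HDP are beyond this abstract, but the two parity directions it combines can be illustrated with plain XOR arithmetic. The sketch below is a toy illustration of horizontal and anti-diagonal parity over a small stripe, not the HDP layout itself, which additionally rotates parity elements across disks to balance I/O.

```python
from functools import reduce
from operator import xor

def horizontal_parity(stripe):
    """One parity element per row: XOR of all data elements in that row."""
    return [reduce(xor, row) for row in stripe]

def antidiagonal_parity(stripe):
    """One parity element per anti-diagonal: XOR of elements whose
    (row + column) index falls on the same diagonal (mod number of columns)."""
    rows, cols = len(stripe), len(stripe[0])
    parity = [0] * cols
    for r in range(rows):
        for c in range(cols):
            parity[(r + c) % cols] ^= stripe[r][c]
    return parity

# A toy 3x5 stripe of byte-sized data elements
stripe = [[0x11, 0x22, 0x33, 0x44, 0x55],
          [0x66, 0x77, 0x88, 0x99, 0xAA],
          [0xBB, 0xCC, 0xDD, 0xEE, 0xFF]]
hp = horizontal_parity(stripe)     # protects against one failed disk per row
adp = antidiagonal_parity(stripe)  # second, independent parity direction
```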
ieee conference on mass storage systems and technologies | 2010
Guanying Wu; Benjamin Eckart; Xubin He
Solid State Drives (SSDs) have shown promise as a candidate to replace traditional hard disk drives, but due to certain physical characteristics of NAND flash, there are some challenging areas of improvement and further research. We focus on the layout and management of the small amount of RAM that serves as a cache between the SSD and the system that uses it. Among the techniques that have previously been proposed to manage this cache, we identify several sources of inefficient cache space management due to the way pages are clustered in blocks and the limited replacement policy. We develop a hybrid page/block architecture along with an advanced replacement policy, called BPAC, or Block-Page Adaptive Cache, to exploit both temporal and spatial locality. Our technique involves adaptively partitioning the SSD on-disk cache to separately hold pages with high temporal locality in a page list and clusters of pages with low temporal but high spatial locality in a block list. We run trace-driven simulations to verify our design and find that it outperforms other popular flash-aware cache schemes under different workloads.
ACM Transactions on Storage | 2012
Guanying Wu; Xubin He; Benjamin Eckart
Solid State Drives (SSDs) have shown promise as a candidate to replace traditional hard disk drives. The benefits of SSDs over HDDs include better durability, higher performance, and lower power consumption, but due to certain physical characteristics of the NAND flash that comprises SSDs, there are some challenging areas of improvement and further research. We focus on the layout and management of the small amount of RAM that serves as a cache between the SSD and the system that uses it. Among the techniques that have previously been proposed to manage this cache, we identify several sources of inefficient cache space management due to the way pages are clustered in blocks and the limited replacement policy. We find that in many traces hot pages reside in otherwise cold blocks, and that the spatial locality of most clusters can be fully exploited within a limited time period, so we develop a hybrid page/block architecture along with an advanced replacement policy, called BPAC, or Block-Page Adaptive Cache, to exploit both temporal and spatial locality. Our technique involves adaptively partitioning the SSD on-disk cache to separately hold pages with high temporal locality in a page list and clusters of pages with low temporal but high spatial locality in a block list. In addition, we have developed a novel mechanism for flash-based SSDs to characterize the spatial locality of the disk I/O workload and an approach to dynamically identify the set of low spatial locality clusters. We run trace-driven simulations to verify our design and find that it outperforms other popular flash-aware cache schemes under different workloads. For instance, compared to BPLRU, a popular flash-aware cache algorithm, BPAC reduces the number of cache evictions by up to 79.6% and by 34% on average.
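A simplified sketch of the hybrid page/block caching idea: hot pages are kept individually in an LRU page list, while other pages are grouped by block and evicted a whole block at a time. The fixed capacities and the simple access-count hotness test below are illustrative assumptions; the BPAC described in the paper adapts the partition between the two lists at runtime and characterizes spatial locality dynamically.

```python
from collections import OrderedDict, defaultdict

PAGES_PER_BLOCK = 64

class SimpleHybridCache:
    """Toy page/block write cache in the spirit of BPAC (not the paper's algorithm)."""

    def __init__(self, page_capacity=256, block_capacity=16, hot_threshold=2):
        self.page_list = OrderedDict()      # lpn -> data (high temporal locality)
        self.block_list = OrderedDict()     # block id -> {lpn: data} (spatial locality)
        self.access_count = defaultdict(int)
        self.page_capacity = page_capacity
        self.block_capacity = block_capacity
        self.hot_threshold = hot_threshold
        self.evictions = []                 # (kind, pages) flushed to flash

    def write(self, lpn, data):
        self.access_count[lpn] += 1
        blk = lpn // PAGES_PER_BLOCK
        if self.access_count[lpn] >= self.hot_threshold:
            # hot page: keep it in the page list; drop any stale copy in the block list
            self.block_list.get(blk, {}).pop(lpn, None)
            self.page_list[lpn] = data
            self.page_list.move_to_end(lpn)
            if len(self.page_list) > self.page_capacity:
                victim, vdata = self.page_list.popitem(last=False)
                self.evictions.append(("page", {victim: vdata}))
        else:
            # cold page: cluster it with its block neighbours
            cluster = self.block_list.setdefault(blk, {})
            cluster[lpn] = data
            self.block_list.move_to_end(blk)
            if len(self.block_list) > self.block_capacity:
                vblk, vcluster = self.block_list.popitem(last=False)
                self.evictions.append(("block", vcluster))
```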
modeling, analysis, and simulation on computer and telecommunication systems | 2010
Guanying Wu; Xubin He; Ningde Xie; Tong Zhang
This paper presents a cross-layer co-design approach to reduce SSD read response latency. The key is to cohesively exploit the NAND flash memory device write speed vs. raw storage reliability trade-off at the physical layer and run-time data access workload variation at the system level. Leveraging run-time data access workload variation, we can opportunistically slow down NAND flash memory write speed and hence improve NAND flash memory raw storage reliability. This naturally enables an opportunistic use of weaker error correction schemes that can directly reduce SSD read access latency. We develop a disk-level scheduling scheme to effectively smooth the write workload in order to maximize the occurrence of run-time opportunistic NAND flash memory write slow-down. Using 2 bits/cell NAND flash memory with BCH-based error correction as a test vehicle, we carry out extensive simulations over various workloads and demonstrate that the developed cross-layer co-design solution can reduce the average SSD read latency by up to 96%.
european conference on computer systems | 2014
Ping Huang; Guanying Wu; Xubin He; Weijun Xiao
Since NAND flash cannot be updated in place, SSDs must perform all writes in pre-erased pages. Consequently, pages containing superseded data must be invalidated and garbage collected. This garbage collection adds significant cost in terms of the extra writes necessary to relocate valid pages from erasure candidates to clean blocks, causing the well-known write amplification problem. SSDs reserve a certain amount of flash space that is invisible to users, called over-provisioning space, to alleviate the write amplification problem. However, NAND blocks can support only a limited number of program/erase cycles. As blocks are retired for exceeding this limit, the shrinking over-provisioning pool leads to degraded SSD performance. In this work, we propose a novel system design, called the Smart Retirement FTL (SR-FTL), to reuse flash blocks that have been cycled to the maximum specified P/E endurance. We take advantage of the fact that the specified P/E limit guarantees a data retention time of at least one year, while most active data becomes stale in a period much shorter than one year, as observed in a variety of disk workloads. Our approach aggressively manages worn blocks to store data that requires only a short retention time, while the data reliability on worn blocks is carefully guaranteed. We evaluate SR-FTL by both simulation on an SSD simulator and a prototype implementation on an OpenSSD platform. Experimental results show that SR-FTL maintains consistent over-provisioning space levels as blocks wear and thus limits the degree of SSD performance degradation near end-of-life. In addition, we show that our scheme reduces block wear near end-of-life by as much as 84% in some scenarios.
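A minimal sketch of the retirement-reuse idea: worn blocks go into a reuse pool, data expected to go stale quickly is placed there, and a per-page retention deadline triggers migration of anything still valid before it expires. The one-week retention figure, the lifetime estimate passed by the caller, and the class interface are illustrative assumptions rather than the paper's characterization.

```python
import time

RETIRED_RETENTION_SECS = 7 * 24 * 3600   # assumed safe retention on worn blocks (illustrative)

class RetiredBlockPool:
    """Toy allocator in the spirit of SR-FTL (not the paper's implementation)."""

    def __init__(self):
        self.retired_free = []           # worn blocks available for reuse
        self.deadlines = {}              # lpn -> time by which data must be refreshed

    def retire(self, block_id):
        """Add a block that has exceeded its rated P/E cycles to the reuse pool."""
        self.retired_free.append(block_id)

    def place(self, lpn, expected_lifetime_secs):
        """Route short-lived data to retired blocks, long-lived data elsewhere."""
        if self.retired_free and expected_lifetime_secs < RETIRED_RETENTION_SECS:
            block = self.retired_free[0]
            self.deadlines[lpn] = time.time() + RETIRED_RETENTION_SECS
            return ("retired", block)
        return ("healthy", None)         # fall back to the normal block pool

    def pages_to_refresh(self, margin_secs=3600):
        """Pages on retired blocks whose retention deadline is within margin_secs."""
        now = time.time()
        return [lpn for lpn, t in self.deadlines.items() if t - now < margin_secs]
```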
ACM Transactions on Design Automation of Electronic Systems | 2013
Guanying Wu; Xubin He; Ningde Xie; Tong Zhang
This article presents a cross-layer co-design approach to reduce SSD read response latency. The key is to cohesively exploit the NAND flash memory device write speed vs. raw storage reliability trade-off at the physical layer and runtime data access workload dynamics at the system level. Leveraging runtime data access workload variation, we can opportunistically slow down NAND flash memory write speed and hence improve NAND flash memory raw storage reliability. This naturally enables an opportunistic use of weaker error correction schemes that can directly reduce SSD read access latency. We develop a disk-level scheduling scheme to effectively smooth the write workload in order to maximize the occurrence of runtime opportunistic NAND flash memory write slowdown. Using 2 bits/cell NAND flash memory with BCH-based error correction as a test vehicle, we carry out extensive simulations over various workloads and demonstrate that the developed cross-layer co-design solution can reduce the average SSD read latency by up to 59.4% without sacrificing write throughput.
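A minimal sketch of the disk-level scheduling idea behind both versions of this work: when the pending write backlog is light, writes are issued in a slower, more reliable programming mode, which in turn permits a weaker and faster-to-decode ECC on later reads. The latency numbers and the queue-depth threshold below are illustrative assumptions, not values from the paper.

```python
from collections import deque

SLOW_WRITE_US = 1600    # slower, more precise programming (assumed latency)
FAST_WRITE_US = 800     # nominal programming speed (assumed latency)
QUEUE_THRESHOLD = 4     # switch to fast writes once the backlog grows beyond this

class OpportunisticWriteScheduler:
    """Toy scheduler in the spirit of the cross-layer co-design above."""

    def __init__(self):
        self.queue = deque()

    def submit(self, write_req):
        self.queue.append(write_req)

    def issue_next(self):
        """Pick the write mode for the next request based on the current backlog."""
        if not self.queue:
            return None
        req = self.queue.popleft()
        if len(self.queue) < QUEUE_THRESHOLD:
            mode, latency = "slow", SLOW_WRITE_US   # light load: trade write speed for reliability
        else:
            mode, latency = "fast", FAST_WRITE_US   # backlogged: preserve write throughput
        return (req, mode, latency)
```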
southeastern symposium on system theory | 2009
Jeremy Langston; Guanying Wu; Xubin He
This paper investigates the impact of data hot spots on the performance and system availability of disk arrays. We perform experiments on RAID 5 to show quantitatively how hot spots affect performance in terms of I/O throughput and availability in terms of mean time to repair (MTTR). We find that for workloads with small accesses, performance and system availability are dominated by the access pattern (read/write), with negligible change due to increases in hot-spot activity. Workloads with larger accesses see a much larger performance and availability impact with increased hot-spot activity.
Journal of Systems Architecture | 2014
Guanying Wu; Ping Huang; Xubin He
In NAND flash memory, once a page program or block erase (P/E) command is issued to a NAND flash chip, subsequent read requests have to wait for the time-consuming P/E operation to complete. Preliminary results show that the lengthy P/E operations increase read latency by 2× on average. This increased read latency caused by the contention may significantly degrade overall system performance. Inspired by the internal mechanism of NAND flash P/E algorithms, we propose in this paper a low-overhead P/E suspension scheme, which suspends the on-going P/E to service pending reads and resumes the suspended P/E afterwards. With reads given the highest priority, we further extend our approach by allowing writes to preempt erase operations in order to improve write latency. In our experiments, we simulate a realistic SSD model with multiple chips and channels and evaluate both SLC and MLC NAND flash to cover a range of performance characteristics. Experimental results show the proposed technique achieves near-optimal performance in servicing read requests, and write latency is significantly reduced as well. Specifically, the read latency is reduced on average by 46.5% compared to RPS (Read Priority Scheduling), and with write-suspend-erase the write latency is reduced by 13.6% relative to FIFO.
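A minimal sketch of the suspension idea: when a read arrives at a chip busy with a program or erase, the long operation is suspended, the read is serviced, and the operation is then resumed. The suspend_pe/resume_pe calls stand in for device-level commands and, like the chip object itself, are hypothetical, not a real NAND interface.

```python
from collections import deque

class ChipScheduler:
    """Toy per-chip scheduler sketching read-preempts-P/E scheduling."""

    def __init__(self, chip):
        self.chip = chip                 # hypothetical object exposing suspend_pe/resume_pe/read
        self.pending_reads = deque()
        self.active_pe = None            # program/erase currently running on the chip, if any

    def start_pe(self, op):
        """Record and start a long program or erase operation on the chip."""
        self.active_pe = op

    def on_read(self, page):
        """Queue an incoming read and service it immediately, suspending any P/E."""
        self.pending_reads.append(page)
        self.service_reads()

    def service_reads(self):
        if not self.pending_reads:
            return
        if self.active_pe is not None:
            self.chip.suspend_pe(self.active_pe)          # pause the long P/E operation
        while self.pending_reads:
            self.chip.read(self.pending_reads.popleft())  # reads see near-idle latency
        if self.active_pe is not None:
            self.chip.resume_pe(self.active_pe)           # continue where the P/E left off
```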
file and storage technologies | 2012
Guanying Wu; Xubin He