Puyuan Yang
University of Science and Technology of China
Publication
Featured research published by Puyuan Yang.
Database Systems for Advanced Applications | 2011
Puyuan Yang; Peiquan Jin; Lihua Yue
Recently, flash-memory-based solid state disks (SSDs) have been considered as alternatives to traditional magnetic disks. However, this has not yet happened because of limitations of SSDs, such as the high latency of write operations and low reliability under unbalanced erasure. A practical approach is therefore to integrate an SSD with a magnetic disk to obtain a better tradeoff between the two storage media. In this paper, we investigate the issues of integrating SSD and disk in the storage layer of a database management system. In particular, we propose a new approach that uses a magnetic disk as the write cache of an SSD, in which each data page is placed either on disk or on SSD. To find an optimal page placement scheme, we first propose a page migration model, which uses two granularities, namely page and block (a set of pages), to perform migration between SSD and disk. Based on this model, we develop an online approach to determining the optimal placement of data pages. We conduct experiments on tailor-made traces to measure the performance of our hybrid storage approach. The results show that our approach ensures that most read operations are performed on SSD and most write operations are directed to disk. Meanwhile, our hybrid approach has a shorter runtime than the single-disk-based mechanism.
Chinese Journal of Computers | 2012
Puyuan Yang; Peiquan Jin; Lihua Yue
Solid state drives (SSDs) based on flash memory have become widely used persistent storage devices. However, they cannot completely replace magnetic disks because of their asymmetric I/O characteristics and the high price of flash memory. Hybrid storage systems consisting of SSD and HDD are therefore gradually becoming a research topic. For hybrid storage with SSD and HDD, this paper proposes a time-sensitive hybrid storage model that makes efficient use of SSD. The model places SSD and HDD at the same level of the memory hierarchy. Using page access counters and access temperature, the model achieves accurate page classification and placement, placing hot and read-intensive pages on SSD and cold or write-intensive pages on HDD. The model thus takes advantage of the asymmetric I/O properties of HDD and SSD to reduce the total I/O latency of the system. We implement the model separately on hybrid storage based on a high-end SSD and on a mid-range SSD and evaluate its performance. The experimental results show that our model achieves more accurate page classification, reduces migration cost, and obtains a clear performance improvement with less SSD storage.
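The placement rule in this abstract (hot, read-intensive pages on SSD; cold or write-intensive pages on HDD) can be illustrated with a minimal sketch. It is not the paper's model: the exponential decay used as "temperature", the names `PageStats`, `record_access`, and `choose_device`, and the thresholds are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class PageStats:
    reads: int = 0            # read counter for the page
    writes: int = 0           # write counter for the page
    temperature: float = 0.0  # decayed access score (assumed definition)
    last_access: float = 0.0  # time of the last access, in seconds


def record_access(stats, now, is_write, half_life=60.0):
    """Update counters and temperature for one access at time `now`.

    Assumption: temperature halves every `half_life` seconds of idleness
    and gains 1.0 per access. The paper may define it differently.
    """
    elapsed = now - stats.last_access
    stats.temperature = stats.temperature * 0.5 ** (elapsed / half_life) + 1.0
    stats.last_access = now
    if is_write:
        stats.writes += 1
    else:
        stats.reads += 1


def choose_device(stats, hot_threshold=2.0, read_ratio=0.5):
    """Return "SSD" for hot, read-intensive pages and "HDD" otherwise.

    Uses the temperature stored at the last access; both thresholds are
    hypothetical tuning knobs.
    """
    total = stats.reads + stats.writes
    if total == 0:
        return "HDD"
    hot = stats.temperature >= hot_threshold
    read_intensive = stats.reads / total > read_ratio
    return "SSD" if hot and read_intensive else "HDD"
```

With these defaults, a page read three times within a few seconds is classified for SSD, while a page that is only written, or read once, stays on HDD.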
Distributed and Parallel Databases | 2015
Peiquan Jin; Puyuan Yang; Lihua Yue
Flash-memory-based solid state drives (SSDs) have been widely used in computer systems. Due to the high price and some specific features of SSDs, such as asymmetric read/write speeds and limited erasure endurance, it has become a very common solution, e.g., in modern data centers, to use hybrid storage systems involving SSDs and traditional hard disks (HDDs). However, SSD/HDD-based hybrid storage systems introduce some new problems for the indexing schemes used in data management. In this paper, we propose a new B+-tree-based index for such hybrid storage systems, called the HybridB tree. The HybridB tree aims to reduce the random writes to SSD while keeping high time performance and low buffer costs. In particular, we introduce a new design called the huge leaf to avoid splits and merges on the B+-tree. A huge leaf node contains two or more leaf nodes in different states. We place the leaf nodes on HDD or SSD according to their current states, and dynamically adapt the states of leaf nodes when they are read or updated. After a detailed explanation of the structure and operations of the HybridB tree, we give a theoretical analysis of its costs. Then, we conduct experiments on two TPC-C traces, using a real hybrid storage system including one HDD and two SSDs, and compare the performance of our proposal with two implementations of the B+-tree, namely the B+-tree on HDD and the B+-tree on SSD/HDD. The results show that our proposal has the best time performance and the lowest buffer costs. Moreover, our proposal is able to effectively reduce the random writes to SSD.
Web-Age Information Management | 2013
Puyuan Yang; Peiquan Jin; Shouhong Wan; Lihua Yue
In recent years, flash-memory-based SSDs (solid state drives) have been regarded as the next-generation storage devices that will replace HDDs (hard disk drives). However, the high price of SSDs, especially high-performance ones, means that SSDs and HDDs are both widely used in real applications. To combine the merits of SSDs and HDDs, using HDDs together with SSDs to construct a hybrid storage system has become a hot research topic. The goal of this paper is to use a cheap low-end SSD and an HDD to build a highly efficient hybrid storage system, which is called HB-Storage. HB-Storage considers the characteristics of SSDs and HDDs, and builds an HDD write buffer to optimize SSD write requests. The write buffer is designed based on data access load statistics. As a consequence, HB-Storage can exploit the higher read performance of SSDs and can also improve the random write latency of SSDs. The experimental results show that HB-Storage can maintain high read performance and significantly reduce the write requests on the SSD, and thus has higher overall performance.
Frontiers of Computer Science in China | 2014
Ke Lu; Peiquan Jin; Puyuan Yang; Shouhong Wan; Lihua Yue
Flash memory is widely used in embedded devices and enterprise storage systems. Currently, flash-based storage devices usually use a flash translation layer (FTL) to cope with the special features of flash memory. Many methods for the design and implementation of the FTL have been proposed, such as BAST (block-associative sector translation), FAST (fully associative sector translation), and IPL (in-page logging), of which IPL has been demonstrated to have the best performance. However, IPL gives little consideration to reducing merge operations, which consequently degrades the overall performance of flash-memory storage systems. We propose an improvement to IPL, called adaptive IPL (AIPL). The idea of adaptive IPL is to make the log region in a block resizable, so that a hot block (i.e., a write-intensive block) uses a large log region to absorb more page updates and in turn reduce merge operations, while a cold block, i.e., a block rarely written to, uses a small log region. This is realized by first detecting the update pattern of a block and then presenting an update-pattern-based algorithm to dynamically adjust the log region size of a newly allocated block. We conduct experiments on TPC-C traces and synthetic traces and compare the performance of AIPL with other competitors in terms of merge count, write count, and elapsed time. The results demonstrate that, compared with IPL, AIPL can reduce merge operations by 65% and write operations by 54% on average.
Very Large Data Bases | 2016
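The idea of sizing a block's log region from its observed update pattern can be sketched as follows. This is a toy model under stated assumptions, not the AIPL algorithm: `ENTRIES_PER_LOG_PAGE`, the clamping bounds, and both function names are hypothetical, and a merge is assumed to happen each time the log region fills up.

```python
ENTRIES_PER_LOG_PAGE = 4  # hypothetical: update records that fit in one log page


def log_region_size(recent_updates, min_log=2, max_log=32):
    """Pick the number of log pages for a newly allocated block.

    Assumption: reserve enough log pages to absorb the updates the block
    received recently, clamped to [min_log, max_log].
    """
    wanted = -(-recent_updates // ENTRIES_PER_LOG_PAGE)  # ceiling division
    return max(min_log, min(max_log, wanted))


def merge_count(updates, log_pages):
    """Count merges in the toy model: one merge per full log region."""
    return updates // (log_pages * ENTRIES_PER_LOG_PAGE)
```

In this model, a block receiving 256 updates triggers `merge_count(256, 2)` merges with the minimum log region but far fewer with `merge_count(256, 32)`, which is the effect a larger log region for hot blocks is meant to achieve.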
Peiquan Jin; Chengcheng Yang; Christian S. Jensen; Puyuan Yang; Lihua Yue
Flash-memory-based solid-state drives (SSDs) are used widely for secondary storage. To be effective for SSDs, traditional indices have to be redesigned to cope with the special properties of flash memory, such as asymmetric read/write latencies (fast reads and slow writes) and out-of-place updates. Previous flash-optimized indices focus mainly on reducing random writes to SSDs, which is typically accomplished at the expense of a substantial number of extra reads. However, modern SSDs show a narrowing gap between read and write speeds, and read operations on SSDs increasingly affect the overall performance of indices on SSDs. As a consequence, how to optimize SSD-aware indices by reducing both write and read costs is a pertinent and open challenge. We propose a new tree index for SSDs that is able to reduce both writes and extra reads. In particular, we use an update buffer and overflow pages to reduce random writes, and we further exploit Bloom filters to reduce the extra reads to the overflow nodes in the tree. With this mechanism, we construct a read/write-optimized index that is capable of offering better overall performance than previous flash-aware indices. In addition, we present an analysis of the proposed index and show that the read and write costs of the operations on the index can be balanced by only tuning the false-positive rate of the Bloom filters. Our experimental results suggest that our proposal is efficient and represents an improvement over existing methods.
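How a Bloom filter can spare reads to overflow pages can be shown with a minimal sketch. The classes `BloomFilter` and `OverflowPage`, the filter size, and the SHA-256-based hashing are assumptions for illustration; the paper's index structure is not reproduced here.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter; the size and hash count are illustrative defaults."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting SHA-256 with the index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means "definitely absent"; True may be a false positive.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


class OverflowPage:
    """Simulated overflow page whose reads are guarded by a Bloom filter."""

    def __init__(self):
        self.entries = {}
        self.filter = BloomFilter()
        self.reads = 0  # number of simulated page reads

    def insert(self, key, value):
        self.entries[key] = value
        self.filter.add(key)

    def lookup(self, key):
        if not self.filter.might_contain(key):
            return None  # definite miss: the page read is skipped
        self.reads += 1  # possible hit: pay for one page read
        return self.entries.get(key)
```

Lookups for keys that were inserted always reach the page, while most lookups for absent keys return without a read. Lowering the false-positive rate (more bits per key) saves more reads at the cost of a larger filter, which is the tuning knob the abstract refers to.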
International Journal of Parallel Programming | 2016
Chengcheng Yang; Peiquan Jin; Lihua Yue; Puyuan Yang
Recently, with the wide use of flash-memory-based solid state drives (SSDs), many studies have been conducted on SSD-based data management, such as index structures, query processing, and buffer management schemes. This paper focuses on buffer schemes for SSD-based database systems. However, unlike previous studies, we concentrate on buffer schemes for tree indexes on SSDs. This work is motivated by the observation that access patterns on index pages differ considerably from those on data pages. Generally, in a typical tree index, e.g., a B+-tree, the root and internal nodes have higher read frequencies than leaf nodes. However, traditional SSD-oriented buffering methods do not consider this special feature of indexes, and thus are not efficient when used as index buffer management schemes. In this paper, we present a new buffering scheme for tree indexes on SSDs, named the Clean-First and Dirty-Redundant-Write (CFDRW) scheme. The contributions of CFDRW are manifold. First, it assigns priorities to index pages to reflect the different access patterns of the nodes in a tree index. Second, it uses priority and recency to detect the hotness of index pages and proposes a new replacement algorithm based on the priority and recency of the buffered index pages. Third, it exploits the internal parallelism of SSDs and proposes to write out buffer pages at a coarse granularity, i.e., to write out several pages using one physical I/O operation. We compare our proposal on two commodity SSDs with many previous methods, including LRU, LIRS, and six flash-memory-based buffering schemes, using synthetic and real workloads. The results show that our proposal outperforms the competitors under all workloads and SSDs in terms of various metrics, including hit ratio, read count, write count, and elapsed time.
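The three ideas named in this abstract (page priorities, clean-first replacement by priority and recency, and batched write-out of dirty pages) can be illustrated with a rough sketch. It is not the CFDRW algorithm itself: the class `IndexBuffer`, the integer priorities, and the batching rule are assumptions chosen to keep the example short.

```python
from collections import OrderedDict


class IndexBuffer:
    """Toy index buffer: evicts clean, low-priority, least-recent pages first."""

    def __init__(self, capacity, batch_size=2):
        self.capacity = capacity
        self.batch_size = batch_size
        # page_id -> [priority, dirty]; insertion order is recency, oldest first.
        self.pages = OrderedDict()
        # Each entry lists the pages written in one simulated physical I/O.
        self.flushes = []

    def access(self, page_id, priority, write=False):
        if page_id in self.pages:
            entry = self.pages.pop(page_id)
            entry[1] = entry[1] or write
        else:
            if len(self.pages) >= self.capacity:
                self._evict()
            entry = [priority, write]
        self.pages[page_id] = entry  # re-insert at the most-recent position

    def _evict(self):
        order = list(self.pages)  # oldest first
        clean = [p for p in order if not self.pages[p][1]]
        if clean:
            # min() keeps the first minimum, i.e. the oldest lowest-priority page.
            victim = min(clean, key=lambda p: self.pages[p][0])
            del self.pages[victim]
            return
        # No clean page: write a batch of dirty pages in one I/O, then evict one.
        batch = sorted(order, key=lambda p: self.pages[p][0])[: self.batch_size]
        self.flushes.append(batch)
        for p in batch:
            self.pages[p][1] = False
        del self.pages[batch[0]]
```

In this sketch, a higher number means a higher priority, so a caller would pass a larger value for root and internal nodes than for leaves. A clean leaf is then evicted before a dirty leaf or the root, and dirty pages are only written out, several at a time, when no clean victim exists.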
International Journal of Digital Content Technology and Its Applications | 2010
Hui Zhao; Peiquan Jin; Puyuan Yang; Lihua Yue
Journal of Universal Computer Science | 2014
Puyuan Yang; Peiquan Jin; Lihua Yue
Archive | 2014
Puyuan Yang; Peiquan Jin; Lihua Yue