Publication


Featured research published by Xubin He.


international conference on cluster computing | 2009

Implementing WebGIS on Hadoop: A case study of improving small file I/O performance on HDFS

Xuhui Liu; Jizhong Han; Yunqin Zhong; Chengde Han; Xubin He

The Hadoop framework has been widely used in various clusters to build large-scale, high-performance systems. However, the Hadoop distributed file system (HDFS) is designed to manage large files and suffers a performance penalty when managing a large number of small files. As a consequence, many web applications, like WebGIS, may not benefit from Hadoop. In this paper, we propose an approach to optimize I/O performance of small files on HDFS. The basic idea is to combine small files into large ones to reduce the number of files, and to build an index for each file. Furthermore, some novel features, such as grouping neighboring files and retaining several of the latest versions of the data, are considered to match the characteristics of WebGIS access patterns. Preliminary experimental results show that our approach achieves better performance.
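The basic idea of combining small files into a large one with a per-file index can be sketched in a few lines. This is only an illustration under stated assumptions: it packs bytes in memory rather than on HDFS, and the names `merge_small_files`, `read_small_file`, and the tile file names are hypothetical, not taken from the paper.

```python
import io


def merge_small_files(files):
    """Pack {name: bytes} into one blob; return (blob, index).

    The index maps each file name to its (offset, length) in the blob,
    so a small file can be read back without scanning the whole blob.
    """
    blob = io.BytesIO()
    index = {}
    for name, data in files.items():
        index[name] = (blob.tell(), len(data))  # record where this file starts
        blob.write(data)
    return blob.getvalue(), index


def read_small_file(blob, index, name):
    """Look up one small file inside the merged blob via the index."""
    offset, length = index[name]
    return blob[offset:offset + length]


# Hypothetical map tiles standing in for WebGIS small files.
tiles = {
    "tile_0_0.png": b"aaaa",
    "tile_0_1.png": b"bb",
    "tile_1_0.png": b"cccccc",
}
blob, index = merge_small_files(tiles)
print(read_small_file(blob, index, "tile_0_1.png"))
```

Storing neighboring tiles next to each other in the blob, as the abstract suggests, would let one sequential read serve several adjacent lookups.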


european conference on computer systems | 2012

Delta-FTL: improving SSD lifetime via exploiting content locality

Guanying Wu; Xubin He

NAND flash-based SSDs suffer from limited lifetime because NAND flash can only be programmed or erased a limited number of times. Among various approaches to address this problem, we propose to reduce the number of writes to the flash by exploiting the content locality between the write data and its corresponding old version in the flash. Content locality means that the new version, i.e., the content of a new write request, shares some degree of similarity with its old version. The information redundancy existing in the difference (delta) between the new and old data leads to a small compression ratio. The key idea of our approach, named Delta-FTL (Delta Flash Translation Layer), is to store this compressed delta in the SSD, instead of the original new data, in order to reduce the number of writes committed to the flash. This write reduction further extends the lifetime of SSDs through less frequent garbage collection, which is a significant write amplification factor in SSDs. Experimental results based on our Delta-FTL prototype show that Delta-FTL can significantly reduce the number of writes and garbage collection operations and thus improve SSD lifetime at the cost of a trivial overhead in read latency.
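A minimal sketch of the delta idea follows. It assumes the delta is a byte-wise XOR of the old and new page, compressed with `zlib`; both choices, the 4096-byte page size, and the function names are assumptions made for illustration, not details confirmed by the abstract.

```python
import random
import zlib


def make_delta(old_page: bytes, new_page: bytes) -> bytes:
    """Compress the XOR difference between two equally sized pages."""
    if len(old_page) != len(new_page):
        raise ValueError("pages must have the same size")
    xor = bytes(a ^ b for a, b in zip(old_page, new_page))
    return zlib.compress(xor)  # mostly zero bytes when the pages are similar


def apply_delta(old_page: bytes, delta: bytes) -> bytes:
    """Rebuild the new page from the old page and the stored delta."""
    xor = zlib.decompress(delta)
    return bytes(a ^ b for a, b in zip(old_page, xor))


# A hypothetical 4096-byte page with random content, then a 16-byte update.
rng = random.Random(0)
old = bytes(rng.randrange(256) for _ in range(4096))
new = bytearray(old)
new[100:116] = b"updated-content!"
new = bytes(new)

delta = make_delta(old, new)
print(len(new), len(delta))  # the delta is far smaller than the full page
```

Reading the new version requires fetching both the old page and the delta, which is consistent with the read latency overhead the abstract mentions.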


dependable systems and networks | 2011

HDP code: A Horizontal-Diagonal Parity Code to Optimize I/O load balancing in RAID-6

Chentao Wu; Xubin He; Guanying Wu; Shenggang Wan; Xiaohua Liu; Qiang Cao; Changsheng Xie

With higher reliability requirements in clusters and data centers, RAID-6 has gained popularity due to its capability to tolerate concurrent failures of any two disks, which has been shown to be of increasing importance in large-scale storage systems. Among various implementations of erasure codes in RAID-6, a typical set of codes known as Maximum Distance Separable (MDS) codes aims to offer data protection against disk failures with optimal storage efficiency. However, because of the limitations of the horizontal parity or diagonal/anti-diagonal parities used in MDS codes, storage systems based on RAID-6 suffer from unbalanced I/O and thus low performance and reliability. To address this issue, in this paper, we propose a new parity called Horizontal-Diagonal Parity (HDP), which takes advantage of both horizontal and diagonal/anti-diagonal parities. The corresponding MDS code, called HDP code, distributes parity elements uniformly across the disks to balance the I/O workloads. HDP also achieves high reliability by speeding up recovery under single or double disk failure. Our analysis shows that HDP provides better balanced I/O and higher reliability compared to other popular MDS codes.
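The load-balancing argument can be illustrated with a toy model. The sketch below is not the HDP layout: it assumes a simplified array with a single parity element per row and compares a dedicated parity disk against a rotated parity position, counting the writes each disk receives when every data element is updated once. The function `write_load` and both layouts are hypothetical.

```python
from collections import Counter


def write_load(num_disks, num_rows, parity_disk_for_row):
    """Count writes per disk when every data element is updated once.

    Each update writes the data element itself and the parity element
    of its row (a simplified single-parity model, not RAID-6).
    """
    load = Counter()
    for row in range(num_rows):
        parity_disk = parity_disk_for_row(row)
        for disk in range(num_disks):
            if disk == parity_disk:
                continue  # this position holds parity, not data
            load[disk] += 1         # write the data element
            load[parity_disk] += 1  # update the row parity
    return [load[d] for d in range(num_disks)]


dedicated = write_load(4, 4, lambda row: 3)       # parity always on disk 3
rotated = write_load(4, 4, lambda row: row % 4)   # parity rotates per row
print(dedicated)  # [4, 4, 4, 12]
print(rotated)    # [6, 6, 6, 6]
```

With a dedicated parity disk, one disk absorbs three times the writes of the others; spreading parity elements evens out the load, which is the effect the abstract attributes to distributing parity uniformly.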


ieee conference on mass storage systems and technologies | 2010

BPAC: An adaptive write buffer management scheme for flash-based Solid State Drives

Guanying Wu; Benjamin Eckart; Xubin He

Solid State Drives (SSDs) have shown promise to be a candidate to replace traditional hard disk drives, but due to certain physical characteristics of NAND flash, there are some challenging areas of improvement and further research. We focus on the layout and management of the small amount of RAM that serves as a cache between the SSD and the system that uses it. Of the techniques that have previously been proposed to manage this cache, we identify several sources of inefficient cache space management due to the way pages are clustered in blocks and the limited replacement policy. We develop a hybrid page/block architecture along with an advanced replacement policy, called BPAC, or Block-Page Adaptive Cache, to exploit both temporal and spatial locality. Our technique involves adaptively partitioning the SSD on-disk cache to separately hold pages with high temporal locality in a page list and clusters of pages with low temporal but high spatial locality in a block list. We run trace-driven simulations to verify our design and find that it outperforms other popular flash-aware cache schemes under different workloads.
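The split between a page list and a block list can be sketched as below. This is a heavily simplified illustration and not the BPAC algorithm: the promotion rule, the eviction rule, the class name `TwoListBuffer`, and `PAGES_PER_BLOCK = 4` are all assumptions, and the adaptive repartitioning described in the abstract is not modeled.

```python
from collections import OrderedDict

PAGES_PER_BLOCK = 4  # hypothetical flash geometry


class TwoListBuffer:
    """Toy write buffer with a page list and a block list."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.page_list = OrderedDict()  # rewritten pages, oldest first
        self.block_list = {}            # block id -> set of buffered pages
        self.flushed = []               # groups of pages written to flash

    def _size(self):
        clustered = sum(len(pages) for pages in self.block_list.values())
        return len(self.page_list) + clustered

    def write(self, lpn):
        block = lpn // PAGES_PER_BLOCK
        if lpn in self.page_list:
            self.page_list.move_to_end(lpn)          # refresh recency
        elif lpn in self.block_list.get(block, ()):
            # Second write to a buffered page: treat it as temporally hot.
            self.block_list[block].discard(lpn)
            if not self.block_list[block]:
                del self.block_list[block]
            self.page_list[lpn] = None
        else:
            self.block_list.setdefault(block, set()).add(lpn)
        while self._size() > self.capacity:
            self._evict()

    def _evict(self):
        if self.block_list:
            # Flush the fullest cluster so one block write covers many pages.
            victim = max(self.block_list, key=lambda b: len(self.block_list[b]))
            self.flushed.append(sorted(self.block_list.pop(victim)))
        else:
            lpn, _ = self.page_list.popitem(last=False)  # least recently used
            self.flushed.append([lpn])


buf = TwoListBuffer(capacity=4)
for lpn in [0, 1, 2, 9, 9, 3]:
    buf.write(lpn)
print(buf.flushed, list(buf.page_list))
```

In this trace, page 9 is written twice and moves to the page list, while pages 0 to 3 accumulate in one block cluster and are flushed together when the buffer overflows.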


international parallel and distributed processing symposium | 2011

H-Code: A Hybrid MDS Array Code to Optimize Partial Stripe Writes in RAID-6

Chentao Wu; Shenggang Wan; Xubin He; Qiang Cao; Changsheng Xie

RAID-6 is widely used to tolerate concurrent failures of any two disks to provide a higher level of reliability with the support of erasure codes. Among many implementations, one class of codes called Maximum Distance Separable (MDS) codes aims to offer data protection against disk failures with optimal storage efficiency. Typical MDS codes contain horizontal and vertical codes. Due to the horizontal parity, in the case of a partial stripe write (an I/O operation that writes new data or updates data on a subset of disks in an array) in a row, horizontal codes may incur fewer I/O operations in most cases, but suffer from unbalanced I/O distribution. They also suffer from high single write complexity. Vertical codes improve single write complexity compared to horizontal codes, but they still suffer from poor performance in partial stripe writes. In this paper, we propose a new XOR-based MDS array code, named Hybrid Code (H-Code), which optimizes partial stripe writes for RAID-6 by taking advantage of both horizontal and vertical codes. H-Code is a solution for an array of (p+1) disks, where p is a prime number. Unlike other codes taking a dedicated anti-diagonal parity strip, H-Code uses a special anti-diagonal parity layout and distributes the anti-diagonal parity elements among disks in the array, which achieves a more balanced I/O distribution. On the other hand, the horizontal parity of H-Code ensures that a partial stripe write to continuous data elements in a row shares the same row parity chain, which can achieve optimal partial stripe write performance. Not only within a row but also within a stripe, H-Code offers optimal partial stripe write complexity to two continuous data elements and optimal partial stripe write performance among all MDS codes to the best of our knowledge. Specifically, compared to RDP and EVENODD codes, H-Code reduces I/O cost by up to 15.54%


local computer networks | 2002

A caching strategy to improve iSCSI performance

Xubin He; Qing Yang; Ming Zhang


ACM Transactions on Storage | 2012

An adaptive write buffer management scheme for flash-based SSDs

Guanying Wu; Xubin He; Benjamin Eckart


Journal of Computers | 2006

Symmetric Active/Active High Availability for High-Performance Computing System Services

Christian Engelmann; Stephen L. Scott; Chokchai Leangsuksun; Xubin He


storage network architecture and parallel i/os | 2003

Performance evaluation of distributed iSCSI RAID

Xubin He; Praveen Beedanagari; Dan Zhou


modeling, analysis, and simulation on computer and telecommunication systems | 2010

DiffECC: Improving SSD Read Performance Using Differentiated Error Correction Coding Schemes

Guanying Wu; Xubin He; Ningde Xie; Tong Zhang

Collaboration


Dive into Xubin He's collaborations.

Top Co-Authors

Changsheng Xie

Huazhong University of Science and Technology

Stephen L. Scott

Oak Ridge National Laboratory

Christian Engelmann

Oak Ridge National Laboratory

Xin Chen

Tennessee Technological University

Chentao Wu

Shanghai Jiao Tong University

Shenggang Wan

Huazhong University of Science and Technology

Qiang Cao

Huazhong University of Science and Technology

Ming Zhang

University of Rhode Island

Guanying Wu

Virginia Commonwealth University