Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Yuhui Deng is active.

Publication


Featured research published by Yuhui Deng.


ACM Computing Surveys | 2011

What is the future of disk drives, death or rebirth?

Yuhui Deng

Disk drives have experienced dramatic development to meet performance requirements since the IBM 1301 disk drive was announced in 1961. However, the performance gap between memory and disk drives has widened to 6 orders of magnitude and continues to widen by about 50% per year. Furthermore, energy efficiency has become one of the most important challenges in designing disk drive storage systems. The architectural design of disk drives has reached a turning point which should allow their performance to advance further, while still maintaining high reliability and energy efficiency. This article explains how disk drives have evolved over five decades to meet challenging customer demands. First of all, it briefly introduces the development of disk drives, and deconstructs disk performance and power consumption. Secondly, it describes the design constraints and challenges that traditional disk drives are facing. Thirdly, it presents some innovative disk drive architectures discussed in the community. Fourthly, it introduces some new storage media types and the impacts they have on the architecture of the traditional disk drives. Finally, it discusses two important evolutions of disk drives: hybrid disk and solid state disk. The article highlights the challenges and opportunities facing these storage devices, and explores how we can expect them to affect storage systems.


The Journal of Supercomputing | 2009

Ant colony optimization inspired resource discovery in P2P Grid systems

Yuhui Deng; Frank Zhigang Wang; Adrian Ciura

It is a challenge for traditional centralized or hierarchical Grid architectures to manage large-scale and dynamic resources while providing scalability. The Peer-to-Peer (P2P) model offers the prospect of dynamicity, scalability, and availability of a large pool of resources. By integrating the P2P philosophy and techniques into a Grid architecture, P2P Grid systems are emerging as a promising platform for executing large-scale, resource-intensive applications. There are two typical resource discovery approaches for a large-scale P2P system. The first is an unstructured approach which propagates query messages to all nodes to locate the required resources. This method does not scale well because each individual query generates a large amount of traffic and the network quickly becomes overwhelmed by the messages. The second is a structured approach which places resources at specified locations to make subsequent queries easier to satisfy. However, this method does not support multi-attribute range queries and may not work well in networks with an extremely transient population. This paper proposes and designs a large-scale P2P Grid system which employs an Ant Colony Optimization (ACO) algorithm to locate the required resources. The ACO method avoids large-scale flat flooding and supports multi-attribute range queries. Multiple ants can be employed to improve the parallelism of the method. A simulator is developed to evaluate the proposed resource discovery mechanism. Comprehensive simulation results validate the effectiveness of the proposed method compared with the traditional unstructured and structured approaches.
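
The ant-inspired discovery idea can be illustrated with a small sketch: each node keeps a pheromone value per neighbor, an ant forwards a multi-attribute range query along pheromone-weighted links, and a successful ant reinforces its path. This is a minimal illustration under assumed names (Node, ant_search, the deposit/evaporation parameters), not the system described in the paper.

import random

class Node:
    def __init__(self, node_id, resources):
        self.node_id = node_id
        self.resources = resources            # e.g. {"cpu": 8, "mem_gb": 16}
        self.neighbors = []                   # list of Node objects
        self.pheromone = {}                   # neighbor_id -> pheromone level

    def matches(self, query):
        # multi-attribute range query: every requested attribute must fall in its range
        return all(lo <= self.resources.get(attr, float("-inf")) <= hi
                   for attr, (lo, hi) in query.items())

def ant_search(start, query, max_hops=10, deposit=1.0, evaporation=0.1):
    # One ant walks the overlay, biased by pheromone, and reinforces a successful path.
    path, current = [start], start
    for _ in range(max_hops):
        if current.matches(query):
            for a, b in zip(path, path[1:]):   # reinforce the links the ant used
                a.pheromone[b.node_id] = a.pheromone.get(b.node_id, 1.0) + deposit
            return current
        if not current.neighbors:
            break
        weights = [current.pheromone.get(n.node_id, 1.0) for n in current.neighbors]
        current = random.choices(current.neighbors, weights=weights)[0]
        path.append(current)
    for node in path:                          # evaporation keeps stale trails from dominating
        node.pheromone = {k: v * (1 - evaporation) for k, v in node.pheromone.items()}
    return None

Releasing several ants concurrently, for instance running ant_search from different entry nodes, mirrors the parallelism mentioned in the abstract.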


Journal of Systems Architecture | 2011

Architectures and optimization methods of flash memory based storage systems

Yuhui Deng; Jipeng Zhou

Flash memory is a non-volatile memory which can be electrically erased and reprogrammed. Its major advantages, such as small physical size, no mechanical components, low power consumption, and high performance, make it likely to replace magnetic disk drives in more and more systems. However, flash memory has four specific features which differ from those of magnetic disk drives and pose challenges for developing practical techniques: (1) flash memory is erased in blocks but written in pages; (2) a block has to be erased before data can be written to it; (3) a block of flash memory can only be written a limited number of times; (4) pages within a block must be written sequentially. This survey presents the architectures, technologies, and optimization methods employed by existing flash memory based storage systems to tackle these challenges. We hope that this paper will encourage researchers to analyze, optimize, and develop practical techniques to improve the performance and reduce the energy consumption of flash memory based storage systems by leveraging the existing methods and solutions.
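
These four constraints are exactly what a flash translation layer (FTL) has to hide from the host. The following is a minimal, assumed sketch of a page-mapping scheme (class and method names such as PageMappedFlash are illustrative, not drawn from the survey): writes always append to a fresh page, reads go through a logical-to-physical map, and erases operate on whole blocks with a limited erase budget.

class PageMappedFlash:
    # Toy page-mapping FTL: out-of-place updates, sequential writes within a block,
    # erase-before-reuse, and a per-block erase budget.

    def __init__(self, num_blocks=4, pages_per_block=4, max_erases=1000):
        self.pages_per_block = pages_per_block
        self.max_erases = max_erases
        self.blocks = [{"next_page": 0, "erases": 0,
                        "pages": [None] * pages_per_block} for _ in range(num_blocks)]
        self.mapping = {}                     # logical page -> (block index, page index)

    def write(self, logical_page, data):
        # (4) pages inside a block must be written sequentially, so updates always
        # append at the current write frontier instead of overwriting in place
        for b, block in enumerate(self.blocks):
            if block["next_page"] < self.pages_per_block:
                p = block["next_page"]
                block["pages"][p] = data
                block["next_page"] += 1
                self.mapping[logical_page] = (b, p)   # the old copy becomes stale
                return
        raise RuntimeError("no free page: garbage collection needed")

    def read(self, logical_page):
        b, p = self.mapping[logical_page]
        return self.blocks[b]["pages"][p]

    def erase(self, block_index):
        # (1)(2) erase works on whole blocks and must precede rewriting them;
        # (3) each block tolerates only a limited number of erase cycles.
        # A real FTL would first migrate any still-valid pages elsewhere.
        block = self.blocks[block_index]
        if block["erases"] >= self.max_erases:
            raise RuntimeError("block worn out")
        block.update(next_page=0, erases=block["erases"] + 1,
                     pages=[None] * self.pages_per_block)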


Operating Systems Review | 2007

A heterogeneous storage grid enabled by grid service

Yuhui Deng; Frank Zhigang Wang

Driven by the explosive growth of data, the storage Grid has become a new model for deploying and managing storage resources distributed across multiple systems and networks, making efficient use of available storage capacity. Building a storage Grid demands corresponding protocols and standards to provide interoperability among the large number of heterogeneous storage systems. Services are becoming a basic application pattern of the Grid because a service offers a standard means of interoperation between different applications running on a variety of platforms. This paper proposes a storage Grid architecture that wraps all distributed and heterogeneous storage resources into Grid services to provide transparent, remote, and on-demand data access. The storage-oriented Grid service can be considered a basic building block of an infinite storage pool which provides good scalability through its inherent parallelism and facilitates simple incremental resource expansion (to add storage resources, one just adds storage services). Grid users can stack simple modular storage services piece by piece as demand grows, instead of buying monolithic storage systems. An implemented proof-of-concept prototype validates that the storage Grid architecture trades at most 5% performance degradation for an infinite and heterogeneous storage pool.
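
As a rough illustration of the "stack storage services piece by piece" idea, the sketch below models each storage resource as a uniform service object and the pool as a collection of such services. The names and the placement policy are illustrative assumptions, not the paper's Grid service implementation.

class StorageService:
    # Uniform wrapper around one heterogeneous storage resource (illustrative).
    def __init__(self, name, capacity_bytes):
        self.name = name
        self.capacity_bytes = capacity_bytes
        self.objects = {}

    def used_bytes(self):
        return sum(len(v) for v in self.objects.values())

    def write(self, key, data):
        self.objects[key] = data

    def read(self, key):
        return self.objects[key]

class StoragePool:
    # Adding capacity is just adding another service, not buying a monolithic array.
    def __init__(self):
        self.services = []
        self.placement = {}                   # key -> service holding the object

    def add_service(self, service):
        self.services.append(service)

    def write(self, key, data):
        # naive placement: pick the least-utilised service; a real Grid service
        # layer would add authentication, replication, and remote transport
        target = min(self.services, key=lambda s: s.used_bytes() / s.capacity_bytes)
        target.write(key, data)
        self.placement[key] = target

    def read(self, key):
        return self.placement[key].read(key)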


Information Sciences | 2008

EED: Energy Efficient Disk drive architecture

Yuhui Deng; Frank Zhigang Wang; Na Helian

Energy efficiency has become one of the most important challenges in designing future computing systems, and the storage system is one of the largest energy consumers within them. This paper proposes an Energy Efficient Disk (EED) drive architecture which integrates a relatively small NAND flash memory into a traditional disk drive to explore the impact of the flash memory on the performance and energy consumption of the disk. The EED monitors data access patterns and moves frequently accessed data from the magnetic disk to the flash memory. Due to this data migration, most data accesses can be satisfied by the flash memory, which extends the idle periods of the disk drive and enables it to stay in a low-power state for longer. Because flash memory consumes considerably less energy and its read accesses are much faster than those of a magnetic disk, the EED can save significant amounts of energy while reducing the average response time. Real-trace-driven simulations are employed to validate the proposed disk drive architecture. An energy coefficient, defined as the product of the average response time and the average energy consumption, is proposed as a metric to evaluate the EED. The simulation results, along with the energy coefficient, show that the EED achieves an 89.11% energy consumption reduction and a 2.04% average response time reduction with the cello99 trace, a 7.5% energy consumption reduction and a 45.15% average response time reduction with the cello96 trace, and a 20.06% energy consumption reduction and a 6.02% average response time reduction with the TPC-D trace. Traditionally, energy conservation and performance improvement are contradictory goals; the EED strikes a good balance between conserving energy and improving performance.
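
A toy sketch of the two ingredients named in the abstract, migration of frequently accessed blocks to flash and the energy coefficient metric. The threshold, capacity, and names are illustrative assumptions, not the paper's parameters.

from collections import Counter

class EEDSimulator:
    # Toy model: frequently accessed blocks migrate from disk to flash,
    # so the disk can stay in a low-power state for longer stretches.

    def __init__(self, flash_capacity_blocks, hot_threshold=3):
        self.flash_capacity = flash_capacity_blocks
        self.hot_threshold = hot_threshold
        self.access_count = Counter()
        self.flash_blocks = set()

    def access(self, block):
        self.access_count[block] += 1
        if (block not in self.flash_blocks
                and self.access_count[block] >= self.hot_threshold
                and len(self.flash_blocks) < self.flash_capacity):
            self.flash_blocks.add(block)      # migrate the hot block to flash
        return "flash" if block in self.flash_blocks else "disk"

def energy_coefficient(avg_response_time_ms, avg_energy_joules):
    # metric used in the abstract: the product of average response time
    # and average energy consumption (lower is better)
    return avg_response_time_ms * avg_energy_joules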


IEEE Transactions on Computers | 2007

Grid-Oriented Storage: A Single-Image, Cross-Domain, High-Bandwidth Architecture

Frank Zhigang Wang; Sining Wu; Na Helian; Michael Andrew Parker; Yike Guo; Yuhui Deng; Vineet R. Khare

This paper describes the grid-oriented storage (GOS) architecture and its implementations. A GOS-specific file system (GOS-FS), the single-purpose intent of a GOS OS, and secure interfaces via the grid security infrastructure (GSI) motivate and enable this new architecture. As an FTP server, GOS with a slimmed-down OS, with a total volume of around 150 MB, outperforms the standard GridFTP by 20-40 percent. As a file server, GOS-FS acts as a network/grid interface, enabling a user to perform searches and access resources without downloading them locally. In real-world tests between Cambridge and Beijing, where the transfer distance is 10,000 km, multistreamed GOS-FS file opening/saving resulted in a remarkable performance increase of about 2-25 times compared to the single-streamed network file system (NFSv4). GOS is expected to be a variant of or successor to the widely used network-attached storage (NAS) and/or storage area network (SAN) products in the grid era.


Parallel Computing | 2008

Dynamic and scalable storage management architecture for Grid Oriented Storage devices

Yuhui Deng; Frank Zhigang Wang; Na Helian; Sining Wu; Chenhan Liao

Most currently deployed Grid systems employ hierarchical or centralized approaches to simplify system management. However, these approaches cannot satisfy the requirements of complex Grid applications which involve hundreds or thousands of geographically distributed nodes. This paper proposes a Dynamic and Scalable Storage Management (DSSM) architecture for Grid Oriented Storage (GOS) devices. Since large-scale data-intensive applications frequently exhibit a high degree of data access locality, the DSSM divides GOS nodes into multiple geographically distributed domains to exploit this locality and simplify intra-domain storage management. Dynamic GOS agents selected from the domains are organized as a virtual agent domain in a Peer-to-Peer (P2P) manner to coordinate the multiple domains. As only the domain agents participate in inter-domain communication, system-wide information dissemination can be done far more efficiently than with flat flooding. Grid service based storage resources are adopted so that simple modular services can be stacked piece by piece as demand grows. The decentralized architecture of the DSSM avoids the hierarchical or centralized approaches of traditional Grid architectures, eliminates the large-scale flat flooding of unstructured P2P systems, and provides an interoperable, seamless, and infinite storage pool in a Grid environment. The DSSM architecture is validated by a proof-of-concept prototype system.
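
A minimal sketch of the domain/agent organization described above, with an assumed election policy (the node with the most free capacity becomes the domain agent); the names and data layout are illustrative only.

from collections import defaultdict

def build_dssm_overlay(nodes):
    # Group nodes by geographic domain and elect one agent per domain.
    # Each node is a dict such as {"id": "gos-7", "domain": "eu-west", "free_gb": 512}.
    domains = defaultdict(list)
    for node in nodes:
        domains[node["domain"]].append(node)

    agents = {d: max(members, key=lambda n: n["free_gb"])
              for d, members in domains.items()}

    # only the agents join the inter-domain (P2P) overlay, so system-wide
    # dissemination touches one node per domain instead of flooding every node
    agent_overlay = list(agents.values())
    return domains, agents, agent_overlay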


Information Sciences | 2009

Exploiting the performance gains of modern disk drives by enhancing data locality

Yuhui Deng

Due to the widening performance gap between RAM and disk drives, a large number of I/O optimization methods have been proposed to alleviate the impact of this gap. One of the most effective approaches to improving disk access performance is enhancing data locality, because it increases the hit ratio of the disk cache and reduces the seek time and rotational latency. Disk drives have experienced dramatic development since the first disk drive was announced in 1956. This paper investigates some important characteristics of modern disk drives. Based on these characteristics and the observation that data access on disk drives is highly skewed, frequently accessed data blocks and correlated data blocks are clustered into objects and moved to the outer zones of a modern disk drive. The idea attempts to enhance spatial locality, improve the efficiency of aggressive sequential prefetch, and take advantage of Zoned Bit Recording (ZBR). An experimental simulation is employed to investigate the performance gains generated by the enhanced data locality. The performance gains are analyzed by breaking down the disk access time into seek time, rotational latency, data transfer time, and the hit ratio of the disk cache. Experimental results provide useful insights into the performance behaviour of a modern disk drive with enhanced data locality.
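
A rough sketch of the block placement idea, assuming a simple heat-ranked remapping; the function and parameter names are illustrative, not the paper's algorithm.

def remap_hot_blocks(access_counts, total_blocks, outer_zone_blocks):
    # Map the most frequently accessed logical blocks to the lowest physical
    # addresses, which here stand for the outer (fastest, ZBR) zone.
    by_heat = sorted(range(total_blocks),
                     key=lambda b: access_counts.get(b, 0), reverse=True)
    mapping = {}
    for physical, logical in enumerate(by_heat):
        mapping[logical] = physical
    # blocks landing below outer_zone_blocks now sit together in the outer zone,
    # which also keeps correlated hot blocks adjacent for sequential prefetch
    hot_set = {logical for logical, physical in mapping.items()
               if physical < outer_zone_blocks}
    return mapping, hot_set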


Journal of Network and Computer Applications | 2015

Skewly replicating hot data to construct a power-efficient storage cluster

Lingwei Zhang; Yuhui Deng; Weiheng Zhu; Jipeng Zhou; Frank Zhigang Wang

The exponential growth of data is presenting challenges to traditional storage systems. Component-based cluster storage systems, due to their high scalability, are becoming the architecture of next-generation storage systems. Cluster storage systems often use data replication to ensure high availability, fault tolerance, and load balance. However, this kind of data replication not only consumes a large amount of storage resources, but also generates more energy consumption. This paper presents a power-aware data replication strategy that leverages data access behavior. The strategy uses the 80/20 rule (80% of data accesses often go to 20% of the storage space) to skewly replicate only the small amount of frequently accessed data. Furthermore, the storage nodes are divided into a hot node set and a cold node set. The hot nodes, which store a small amount of hot data copies, are always in an active state to guarantee the QoS of the system. The cold nodes, which store a large volume of infrequently accessed cold data, are placed in a low-power state, thus reducing the energy consumption of the cluster storage system. Simulation results show that the proposed strategy can effectively reduce the resource and energy consumption of the system while ensuring system performance.
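
A small sketch of the skewed replication policy, assuming a fixed hot fraction and replica count; the names and the round-robin placement are illustrative, not the paper's exact strategy.

def plan_replication(access_counts, nodes, hot_fraction=0.2, hot_replicas=3):
    # Split objects into hot/cold by the 80/20 rule and nodes into hot/cold sets.
    # Hot objects get extra replicas on always-active hot nodes; cold objects keep
    # a single copy on nodes that may be placed in a low-power state.
    ranked = sorted(access_counts, key=access_counts.get, reverse=True)
    hot_cut = max(1, int(len(ranked) * hot_fraction))
    hot_objects, cold_objects = ranked[:hot_cut], ranked[hot_cut:]

    node_cut = max(1, int(len(nodes) * hot_fraction))
    hot_nodes, cold_nodes = nodes[:node_cut], nodes[node_cut:]

    placement = {}
    for i, obj in enumerate(hot_objects):
        # round-robin the extra replicas across the hot (always-on) node set
        replicas = min(hot_replicas, len(hot_nodes))
        placement[obj] = [hot_nodes[(i + r) % len(hot_nodes)] for r in range(replicas)]
    for i, obj in enumerate(cold_objects):
        placement[obj] = [cold_nodes[i % len(cold_nodes)]]
    return placement, hot_nodes, cold_nodes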


IEEE International Conference on Cloud Computing Technology and Science | 2011

Conserving disk energy in virtual machine based environments by amplifying bursts

Yuhui Deng; Brandon Pung

Computer systems are now powerful enough to run multiple virtual machines (VMs), each running a separate operating system (OS) instance. In such an environment, direct and centralized energy management by a single OS is infeasible. Accurately predicting idle intervals is one of the major approaches to saving the energy of disk drives. However, for intensive workloads it is difficult to find long idle intervals, and even when they exist it is very difficult for a predictor to catch the idle spikes in the workloads. This paper proposes to divide the workloads into buckets of equal time length and to predict the number of forthcoming requests in each bucket instead of the length of the idle periods. By doing so, the bucket method makes the converted workload more predictable. The method also shifts the execution of each request to the end of its respective bucket, thus extending the idle length. By deliberately reshaping the workloads so that the crests and troughs of each workload become aligned, we can aggregate the peaks and the idle periods of the workloads. Due to the extended idle length caused by this aggregation, energy can be conserved. Furthermore, as a result of aligning the peaks, resource utilization is improved when the system is active. A trace-driven simulator is designed to evaluate the idea. Three traces are employed to represent the workloads issued by three web servers residing in three VMs. The experimental results show that our method can save significant amounts of energy by sacrificing a small amount of quality of service.
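
A minimal sketch of the bucket transformation, assuming a fixed bucket length in seconds; the names are illustrative.

def bucketize(request_times, bucket_seconds):
    # Convert a request trace into per-bucket counts and deferred issue times.
    # Instead of predicting how long an idle period lasts, a predictor only has
    # to estimate the number of requests per fixed-length bucket; requests are
    # pushed toward the end of their bucket to lengthen the preceding idle gap.
    counts = {}
    deferred = []
    for t in request_times:
        bucket = int(t // bucket_seconds)
        counts[bucket] = counts.get(bucket, 0) + 1
        deferred.append((bucket + 1) * bucket_seconds)   # issue at the bucket boundary
    return counts, sorted(deferred)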

Collaboration


Dive into Yuhui Deng's collaborations.

Top Co-Authors

Na Helian
University of Hertfordshire

Laurence T. Yang
St. Francis Xavier University

Dan Feng
Huazhong University of Science and Technology