Publications


Featured research published by Lingfang Zeng.


International Conference on Cluster Computing | 2010

CDRM: A Cost-Effective Dynamic Replication Management Scheme for Cloud Storage Cluster

Qingsong Wei; Bharadwaj Veeravalli; Bozhao Gong; Lingfang Zeng; Dan Feng

Data replication has been widely used as a means of increasing the data availability of large-scale cloud storage systems in which failures are normal. Aiming to provide cost-effective availability and improve the performance and load balancing of cloud storage, this paper presents a cost-effective dynamic replication management scheme referred to as CDRM. A novel model is proposed to capture the relationship between availability and replica number, and CDRM leverages this model to calculate and maintain the minimal replica number for a given availability requirement. Replica placement is based on the capacity and blocking probability of data nodes. By adjusting replica number and location according to changing workloads and node capacities, CDRM can dynamically redistribute workloads among data nodes in a heterogeneous cloud. We implemented CDRM in the Hadoop Distributed File System (HDFS), and experimental results conclusively demonstrate that CDRM is cost effective and outperforms the default replication management of HDFS in terms of performance and load balancing for large-scale cloud storage.
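
As a rough illustration of the kind of availability model described above, the sketch below derives a minimal replica count under the simplifying assumption of independent node failures with a uniform per-node availability; the formula and function name are illustrative and not taken from the paper.

```python
import math

def min_replicas(node_availability: float, target_availability: float) -> int:
    """Smallest replica count r with 1 - (1 - p)**r >= A, assuming independent
    node failures with uniform per-node availability p (illustrative only)."""
    p, a = node_availability, target_availability
    if not (0.0 < p < 1.0 and 0.0 < a < 1.0):
        raise ValueError("availabilities must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - a) / math.log(1.0 - p))

# Example: nodes that are up 95% of the time, target availability of 99.999%
print(min_replicas(0.95, 0.99999))  # -> 4
```

Under these assumptions, four replicas already push the expected availability above five nines; automating this kind of trade-off per availability requirement is the flavor of what CDRM does.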


International Parallel and Distributed Processing Symposium | 2010

HPDA: A hybrid parity-based disk array for enhanced performance and reliability

Bo Mao; Hong Jiang; Dan Feng; Suzhen Wu; Jianxi Chen; Lingfang Zeng; Lei Tian

A single flash-based Solid State Drive (SSD) cannot satisfy the capacity, performance, and reliability requirements of a modern storage system supporting increasingly demanding data-intensive computing applications. Applying RAID schemes to SSDs to meet these requirements, while a logical and viable solution, faces many challenges. In this paper, we propose a Hybrid Parity-based Disk Array architecture, HPDA, which combines a group of SSDs and two hard disk drives (HDDs) to improve the performance and reliability of SSD-based storage systems. In HPDA, the SSDs (data disks) and part of one HDD (parity disk) compose a RAID4 disk array. Meanwhile, a second HDD and the free space of the parity disk are mirrored to form a RAID1-style write buffer that temporarily absorbs small write requests and acts as a surrogate set during recovery when a disk fails. The buffered write data is reclaimed to the data disks during lightly loaded or idle periods of the system. Reliability analysis shows that the reliability of HPDA, in terms of MTTDL (Mean Time To Data Loss), is better than that of either a pure HDD-based or SSD-based disk array. Our prototype implementation of HPDA and performance evaluations show that HPDA significantly outperforms either an HDD-based or SSD-based disk array.


IEEE International Conference on Cloud Computing Technology and Science | 2010

SafeVanish: An Improved Data Self-Destruction for Protecting Data Privacy

Lingfang Zeng; Zhan Shi; Shengjie Xu; Dan Feng

In the cloud setting, self-destructing data mainly aims at protecting data privacy. All the data and its copies become destructed or unreadable after a user-specified period, without any user intervention, and after the timeout no one, neither the sender nor the receiver, can obtain the decryption key. The University of Washington's Vanish system is a system for self-destructing data under cloud computing, but it is vulnerable to hopping attacks and sniffing attacks. In this paper we propose a new scheme, called SafeVanish, to prevent hopping attacks by extending the length range of the key shares, which increases the attack cost substantially, and by improving the Shamir secret sharing algorithm implemented in the original Vanish system. We also present an improved approach against sniffing attacks that uses a public key cryptosystem to protect against sniffing operations. In addition, we analytically evaluate the functionality of the proposed SafeVanish system.
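
Since the scheme builds on Shamir secret sharing, a minimal (k, n) sharing sketch over a prime field is shown below to make the key-splitting idea concrete; the field size, parameters, and function names are illustrative assumptions, not SafeVanish's actual construction.

```python
# Minimal (k, n) Shamir secret sharing sketch over a prime field (illustrative only).
import random

PRIME = 2**127 - 1  # a Mersenne prime; large enough for a ~128-bit key space

def split(secret: int, k: int, n: int):
    """Split `secret` into n shares, any k of which reconstruct it."""
    coeffs = [secret] + [random.randrange(PRIME) for _ in range(k - 1)]
    def f(x):
        return sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME
    return [(x, f(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 recovers the secret from k shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

key = random.randrange(PRIME)
shares = split(key, k=7, n=10)
assert reconstruct(random.sample(shares, 7)) == key
```

Fewer than k shares reveal nothing about the key, which is why raising the attack cost of harvesting shares (as SafeVanish does) directly strengthens the self-destruction guarantee.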


IEEE Conference on Mass Storage Systems and Technologies | 2011

WAFTL: A workload adaptive flash translation layer with data partition

Qingsong Wei; Bozhao Gong; Suraj Pathak; Bharadwaj Veeravalli; Lingfang Zeng; Kanzo Okada

Current FTL schemes have inevitable limitations in terms of memory requirements, performance, garbage collection overhead, and scalability. To overcome these limitations, we propose a workload-adaptive flash translation layer referred to as WAFTL. WAFTL uses either page-level or block-level address mapping for normal data blocks based on access patterns. Page Mapping Blocks (PMB) store random data and handle large numbers of partial updates, while Block Mapping Blocks (BMB) store sequential data and reduce the overall mapping table size. PMB or BMB are allocated on demand, and their numbers ultimately depend on the workload. An efficient address mapping scheme is designed to reduce the overall mapping table and perform address translation quickly. WAFTL uses a small part of the flash space as a Buffer Zone to log writes sequentially and migrates data into BMB or PMB based on a threshold. Static and dynamic threshold settings are proposed to balance performance and mapping table size. WAFTL has been extensively evaluated under various enterprise workloads. Benchmark results conclusively demonstrate that the proposed WAFTL is workload adaptive and achieves up to 80% performance improvement, 83% garbage collection overhead reduction, and 50% mapping table reduction compared to existing FTL schemes.
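
To make the data-partition idea concrete, here is a toy sketch that classifies a buffered run of logical pages as a candidate for block mapping (BMB) when it is sufficiently sequential and for page mapping (PMB) otherwise; the threshold value and run-detection logic are assumptions for illustration, not WAFTL's actual migration policy.

```python
# Illustrative WAFTL-style classification of data logged in the Buffer Zone.
SEQUENTIAL_RUN_THRESHOLD = 16  # pages; hypothetical cutoff, not the paper's value

def classify_buffered_run(page_numbers):
    """Return 'BMB' for long sequential runs, 'PMB' otherwise."""
    run = longest = 1
    for prev, cur in zip(page_numbers, page_numbers[1:]):
        run = run + 1 if cur == prev + 1 else 1
        longest = max(longest, run)
    return "BMB" if longest >= SEQUENTIAL_RUN_THRESHOLD else "PMB"

# A long in-order run migrates to block mapping; scattered updates stay page mapped.
print(classify_buffered_run(list(range(100, 132))))      # -> BMB
print(classify_buffered_run([7, 3, 91, 15, 4, 88, 42]))  # -> PMB
```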


Journal of Parallel and Distributed Computing | 2015

SABA: A security-aware and budget-aware workflow scheduling strategy in clouds

Lingfang Zeng; Bharadwaj Veeravalli; Xiaorong Li

High quality of security service is increasingly critical for Cloud workflow applications. However, existing scheduling strategies for Cloud systems disregard the security requirements of workflow applications and consider only CPU time, neglecting other resources such as memory and storage capacity. Competition for these resources can noticeably affect the computation time and monetary cost of both submitted tasks and their required security services. To address this issue, in this paper we introduce the immovable-dataset concept, which constrains the movement of certain datasets due to security and cost considerations, and propose a new scheduling model in the context of Cloud systems. Based on this concept, we propose a Security-Aware and Budget-Aware workflow scheduling strategy (SABA) that provides an economical distribution of tasks among the available CSPs (Cloud Service Providers) in the market, offering customers shorter makespans as well as security services. We conducted extensive simulation studies using six different workflows from real-world applications as well as synthetic ones. The results indicate that scheduling performance is affected by immovable datasets in Clouds and that the proposed scheduling strategy is highly effective across a wide spectrum of workflow applications.
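
As a rough illustration of budget-aware provider selection in this spirit, the toy sketch below picks, for a single task, the cheapest provider that meets the task's security level and fits the remaining budget; the data model, names, and selection rule are assumptions, not the SABA algorithm.

```python
from dataclasses import dataclass

@dataclass
class CSP:
    name: str
    price_per_hour: float
    security_level: int   # higher is stronger
    est_hours: float

def pick_csp(providers, required_security, remaining_budget):
    # Keep only providers that satisfy the security requirement and the budget,
    # then prefer the cheapest total cost, breaking ties by shorter run time.
    feasible = [p for p in providers
                if p.security_level >= required_security
                and p.price_per_hour * p.est_hours <= remaining_budget]
    return min(feasible,
               key=lambda p: (p.price_per_hour * p.est_hours, p.est_hours),
               default=None)

providers = [CSP("A", 0.10, 1, 4.0), CSP("B", 0.25, 3, 3.0), CSP("C", 0.40, 3, 1.5)]
print(pick_csp(providers, required_security=3, remaining_budget=1.0))
# -> CSP "C" (total cost 0.60); "A" is cheaper but fails the security requirement
```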


Modeling, Analysis, and Simulation of Computer and Telecommunication Systems | 2008

GRAID: A Green RAID Storage Architecture with Improved Energy Efficiency and Reliability

Bo Mao; Dan Feng; Hong Jiang; Suzhen Wu; Jianxi Chen; Lingfang Zeng

Existing power-aware optimization schemes for disk-array systems tend to strike a delicate balance between energy consumption and performance while ignoring reliability. To achieve a reasonably good trade-off among these three important design objectives, in this paper we introduce an energy-efficient disk array architecture, called Green RAID (or GRAID), which extends the data-mirroring redundancy of RAID 10 by incorporating a dedicated log disk. The goal of GRAID is to significantly improve the energy efficiency or reliability of existing RAID-based systems without noticeably sacrificing their reliability or energy efficiency. The main idea behind GRAID is to update the mirroring disks only periodically while storing all updates since the last mirror-disk update on a log disk, so that all the mirroring disks (half of the total disks) can be spun down to a lower power mode most of the time to save energy without sacrificing reliability. Reliability analysis shows that the reliability of GRAID, in terms of MTTDL (Mean Time To Data Loss), is only slightly worse than that of RAID 10. On the other hand, our prototype implementation of GRAID and performance evaluation show that GRAID's energy efficiency is better than that of RAID 10 by up to 32.1% and by 25.4% on average.
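
A schematic sketch of the logging policy described above: writes land on the active half immediately and are appended to the log disk, and the spun-down mirror half is synchronized from the log only periodically. Class and method names here are illustrative, not the authors' implementation.

```python
class GraidSketch:
    def __init__(self):
        self.primary = {}   # block -> data on the active (spun-up) half
        self.mirror = {}    # block -> data on the mostly spun-down mirror half
        self.log = []       # pending updates held on the dedicated log disk

    def write(self, block, data):
        # Redundancy is preserved (primary + log) without waking the mirrors.
        self.primary[block] = data
        self.log.append((block, data))

    def sync_mirrors(self):
        """Periodically spin up the mirrors, replay the log, then clear it."""
        for block, data in self.log:
            self.mirror[block] = data
        self.log.clear()
```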


International Conference on Parallel and Distributed Systems | 2009

JOR: A Journal-guided Reconstruction Optimization for RAID-Structured Storage Systems

Suzhen Wu; Dan Feng; Hong Jiang; Bo Mao; Lingfang Zeng; Jianxi Chen

This paper proposes a simple and practical RAID reconstruction optimization scheme, called JOurnal-guided Reconstruction (JOR). JOR exploits the fact that significant portions of data blocks in typical disk arrays are unused. JOR monitors the storage space utilization status at the block level to guide the reconstruction process so that only failed data on the used stripes is recovered to the spare disk. In JOR, data consistency is ensured by the requirement that all blocks in a disk array be initialized to zero (written with value zero) during synchronization while all blocks in the spare disk also be initialized to zero in the background. JOR can be easily incorporated into any existing reconstruction approach to optimize it, because the former is independent of and orthogonal to the latter. Experimental results obtained from our JOR prototype implementation demonstrate that JOR reduces reconstruction times of two state-of-the-art reconstruction schemes by an amount that is approximately proportional to the percentage of unused storage space while ensuring data consistency.
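
The core idea can be pictured with a small sketch: consult a block-level utilization bitmap and rebuild only the used stripes onto the spare disk. The XOR rebuild below assumes a simple single-parity layout, and the names are invented purely for illustration.

```python
from functools import reduce

def rebuild(stripes, used_bitmap, failed_disk, spare):
    """stripes: list of stripes, each a list of per-disk byte chunks."""
    for idx, stripe in enumerate(stripes):
        if not used_bitmap[idx]:
            continue  # JOR's key point: skip stripes that hold no live data
        surviving = [chunk for d, chunk in enumerate(stripe) if d != failed_disk]
        # XOR the surviving chunks to regenerate the failed chunk on the spare.
        spare[idx] = reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), surviving)
```

Because both the array and the spare start zero-initialized, skipped stripes remain consistent without ever being read or written during reconstruction.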


Journal of Network and Computer Applications | 2015

An integrated task computation and data management scheduling strategy for workflow applications in cloud environments

Lingfang Zeng; Bharadwaj Veeravalli; Albert Y. Zomaya

A workflow is a systematic computation or a data-intensive application with regular computation and data access patterns. Designing scalable scheduling algorithms that exploit these runtime regularities effectively is key in Cloud environments. Existing research tends to treat task scheduling and data management optimization for workflows separately, and little attention has been paid so far to combining the two. The proposed scheme shows that coordinating task computation and data management can improve scheduling performance. Our model takes data management into account to obtain a satisfactory makespan across multiple datacenters, and our adaptive data-dependency analysis can reveal parallelization opportunities. In this paper, we introduce an adaptive data-aware scheduling (ADAS) strategy for workflow applications. It consists of a set-up stage, which builds clusters of workflow tasks and datasets, and a run-time stage, which overlaps the execution of the workflows. Through rigorous performance evaluation studies, we demonstrate that our strategy can effectively improve workflow completion time and resource utilization in a Cloud environment.
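
As a toy illustration of co-locating tasks with their data in the set-up stage, the sketch below greedily places each task in the datacenter that already holds most of its input datasets; the greedy rule and all names are assumptions, not the paper's clustering algorithm.

```python
def place_tasks(task_inputs, dataset_location):
    """task_inputs: {task: [dataset, ...]}; dataset_location: {dataset: datacenter}."""
    placement = {}
    for task, datasets in task_inputs.items():
        counts = {}
        for ds in datasets:
            dc = dataset_location[ds]
            counts[dc] = counts.get(dc, 0) + 1
        # Put the task where most of its inputs already live, avoiding data movement.
        placement[task] = max(counts, key=counts.get)
    return placement

print(place_tasks({"t1": ["d1", "d2", "d3"]},
                  {"d1": "dc-A", "d2": "dc-A", "d3": "dc-B"}))  # -> {'t1': 'dc-A'}
```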


IEEE Conference on Mass Storage Systems and Technologies | 2012

HRAID6ML: A hybrid RAID6 storage architecture with mirrored logging

Lingfang Zeng; Dan Feng; Jianxi Chen; Qingsong Wei; Bharadwaj Veeravalli; Wenguo Liu

RAID6 provides high reliability through double parity updates, at the cost of a high write penalty. In this paper, we propose HRAID6ML, a new logging architecture for RAID6 systems with enhanced energy efficiency, performance, and reliability. HRAID6ML combines a group of Solid State Drives (SSDs) and Hard Disk Drives (HDDs): two HDDs (parity disks) and several SSDs form a RAID6 array, and the free space of the two parity disks is used as a mirrored log region for the whole system to absorb writes. The mirrored logging policy helps the system recover from a parity disk failure, and the mirrored logging operation does not introduce noticeable performance overhead. HRAID6ML eliminates additional hardware and energy costs, a potential single point of failure, and a performance bottleneck. Furthermore, HRAID6ML prolongs the life cycle of the SSDs and improves the system's energy efficiency by reducing the SSDs' write frequency. We have implemented the proposed HRAID6ML, and extensive trace-driven evaluations demonstrate the advantages of the HRAID6ML system over both a traditional SSD-based RAID6 system and an HDD-based RAID6 system.


International Conference on Machine Learning and Cybernetics | 2004

SOSS: smart object-based storage system

Lingfang Zeng; Dan Feng; Lingjun Qin

Hints for a storage system come from three sources: the first comes from accurate file or directory attribute layout; the second comes from the existing file system and depends on user input; the third comes from file content analysis. In SOSS (smart object-based storage system), however, the object, a new fundamental storage component different from the traditional storage units (files or blocks), is introduced. Objects provide ample hints for the storage system, which can help in designing a more intelligent (or smarter) storage system. This paper gives a brief introduction to SOSS and studies some of the methods adopted in SOSS to achieve intelligence, such as a pattern recognition method, an object property prediction method, and an adaptive cache replacement policy, all based on object attributes.

Collaboration


Dive into Lingfang Zeng's collaborations.

Top Co-Authors

Dan Feng, Huazhong University of Science and Technology
Jianxi Chen, Huazhong University of Science and Technology
Fang Wang, Huazhong University of Science and Technology
Bharadwaj Veeravalli, National University of Singapore
Ke Zhou, Huazhong University of Science and Technology
Yang Wang, Chinese Academy of Sciences
Lei Tian, University of Nebraska–Lincoln