Publications


Featured research published by Maohua Lu.


data compression, communications and processing | 2011

Quick Estimation of Data Compression and De-duplication for Large Storage Systems

Cornel Constantinescu; Maohua Lu

Many new storage systems provide some form of data reduction. In a recent paper we investigated how compression and de-duplication can be combined in primary storage systems serving active data. In this paper we try to answer the question someone would ask before upgrading to a new, data-reduction-enabled storage server: how much storage savings would the new system offer for the data I have stored right now? We investigate methods to quickly estimate the storage savings potential of the customary data reduction methods used in large-scale storage systems: compression and full-file de-duplication. We show that the compression ratio achievable on a large storage system can be estimated precisely with just a couple of percent (worst case) of the work required to compress each file in the system. We also show that full-file duplicates can be discovered very quickly, with at most 4% error (worst case), by a robust heuristic.
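The sampling idea behind this kind of estimate can be illustrated with a minimal sketch: compress a small random sample of files and extrapolate the resulting ratio to the whole dataset. This is not the paper's estimator; the sample fraction, function name, and use of zlib are illustrative assumptions.

```python
import random
import zlib

def estimate_compression_ratio(file_paths, sample_fraction=0.02, seed=0):
    """Estimate a dataset-wide compression ratio from a small random sample
    of files (illustrative sketch, not the estimator from the paper)."""
    if not file_paths:
        return 1.0
    rng = random.Random(seed)
    sample_size = max(1, int(len(file_paths) * sample_fraction))
    sample = rng.sample(file_paths, sample_size)

    raw_bytes = 0
    compressed_bytes = 0
    for path in sample:
        with open(path, "rb") as f:
            data = f.read()
        raw_bytes += len(data)
        compressed_bytes += len(zlib.compress(data))

    # Extrapolate the sample's compressed/raw ratio to the full file set.
    return compressed_bytes / raw_bytes if raw_bytes else 1.0
```

A real estimator would also have to weight the sample by file size and type so that a few unusually compressible (or incompressible) files do not dominate the estimate.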


acm international conference on systems and storage | 2012

Insights for data reduction in primary storage: a practical analysis

Maohua Lu; David D. Chambliss; Joseph S. Glider; Cornel Constantinescu

There has been increasing interest in deploying data reduction techniques in primary storage systems. This paper analyzes large datasets in four typical enterprise data environments to find patterns that can suggest good design choices for such systems. The overall data reduction opportunity is evaluated for deduplication and compression, separately and combined, then in-depth analysis is presented focusing on frequency, clustering and other patterns in the collected data. The results suggest ways to enhance performance and reduce resource requirements and system cost while maintaining data reduction effectiveness. These techniques include deciding which files to compress based on file type and size, using duplication affinity to guide deployment decisions, and optimizing the detection and mapping of duplicate content adaptively when large segments account for most of the opportunity.
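One of the suggested design choices, deciding which files to compress based on file type and size, can be sketched as a simple policy function. The extension list and size threshold below are hypothetical placeholders, not values from the paper.

```python
import os.path

# Hypothetical policy inputs: formats that are already compressed, and a
# minimum size below which per-file overhead outweighs the savings.
ALREADY_COMPRESSED = {".jpg", ".png", ".mp4", ".zip", ".gz", ".mp3"}
MIN_SIZE_BYTES = 4096

def should_compress(filename: str, size_bytes: int) -> bool:
    """Return True if the file is worth compressing under this toy policy."""
    ext = os.path.splitext(filename)[1].lower()
    if ext in ALREADY_COMPRESSED:
        return False
    return size_bytes >= MIN_SIZE_BYTES
```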


acm international conference on systems and storage | 2013

A scalable deduplication and garbage collection engine for incremental backup

Dilip Nijagal Simha; Maohua Lu; Tzi-cker Chiueh

Very large block-level data backup systems need scalable data deduplication and garbage collection techniques to make efficient use of the storage space and to minimize the performance overhead of doing so. Although the deduplication and garbage collection logic is conceptually straightforward, their implementations pose a significant technical challenge because only a small portion of the associated data structures can fit in memory. In this paper, we describe the design, implementation and evaluation of a data deduplication and garbage collection engine called Sungem that is designed to remove duplicate blocks in incremental data backup streams. Sungem features three novel techniques to maximize the deduplication throughput without compromising the deduplication ratio. First, Sungem puts related fingerprint sequences, rather than fingerprints from the same backup stream, into the same container in order to increase the fingerprint prefetching efficiency. Second, to make the most of the memory space reserved for storing fingerprints, Sungem varies the sampling rates for fingerprint sequences based on their stability. Third, Sungem combines reference counts and expiration times in a unique way to arrive at the first known incremental garbage collection algorithm whose bookkeeping overhead is proportional to the size of a disk volume's incremental backup snapshot rather than its full backup snapshot. We evaluated the Sungem prototype using a real-world data backup trace and showed that the average throughput of Sungem is more than 200,000 fingerprint lookups per second on a standard x86 server, including the garbage collection cost.
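The third technique, combining a reference count with an expiration time to decide when a chunk becomes garbage, can be illustrated with a minimal sketch. The record layout and field names below are assumptions for illustration, not Sungem's actual data structures.

```python
import time
from dataclasses import dataclass
from typing import Optional

@dataclass
class ChunkRecord:
    ref_count: int     # live backup snapshots still referencing this chunk
    expires_at: float  # retention deadline of the newest referencing snapshot

def is_garbage(record: ChunkRecord, now: Optional[float] = None) -> bool:
    """A chunk is reclaimable only when nothing references it AND its
    retention window has passed (sketch of the combined test only)."""
    now = time.time() if now is None else now
    return record.ref_count == 0 and now >= record.expires_at
```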


ieee conference on mass storage systems and technologies | 2014

DedupT: Deduplication for tape systems

Abdullah Gharaibeh; Cornel Constantinescu; Maohua Lu; Ramani R. Routray; Anurag Sharma; Prasenjit Sarkar; David Pease; Matei Ripeanu

Deduplication is a commonly used technique on disk-based storage pools. However, deduplication has not been used for tape-based pools: tape characteristics, such as high mount and seek times, combined with the data fragmentation resulting from deduplication, create a toxic combination that leads to unacceptably high retrieval times. This work proposes DedupT, a system that efficiently supports deduplication on tape pools. This paper (i) details the main challenges of enabling efficient deduplication on tape libraries; (ii) presents a class of solutions based on graph modeling of the similarity between data items that enables efficient placement on tapes; and (iii) presents the design and evaluation of novel cross-tape and on-tape chunk placement algorithms that alleviate tape mount time overhead and reduce on-tape data fragmentation. Using 4.5 TB of real-world workloads, we show that DedupT retains at least 95% of the deduplication efficiency. We show that DedupT mitigates major retrieval time overheads and, by reading less data, is able to offer better restore performance than restoring non-deduplicated data.
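The graph model of similarity between data items can be made concrete with a small sketch: nodes are backup objects, and edge weights count the chunks two objects share, so a placement heuristic can co-locate strongly connected objects on the same tape. The function below is an illustrative construction, not DedupT's algorithm.

```python
from collections import defaultdict
from itertools import combinations

def build_similarity_graph(object_chunks):
    """object_chunks: dict mapping object id -> set of chunk fingerprints.
    Returns a dict mapping (id_a, id_b) -> number of shared chunks."""
    owners = defaultdict(set)           # chunk fingerprint -> owning objects
    for oid, chunks in object_chunks.items():
        for fp in chunks:
            owners[fp].add(oid)

    edges = defaultdict(int)
    for oids in owners.values():        # every pair sharing this chunk
        for a, b in combinations(sorted(oids), 2):
            edges[(a, b)] += 1
    return edges
```

A cross-tape placement pass could then grow each tape's group greedily, repeatedly pulling in the object with the heaviest edges into the group until the tape's capacity is reached; the actual DedupT algorithms are more elaborate.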


Algorithms | 2012

Content Sharing Graphs for Deduplication-Enabled Storage Systems

Maohua Lu; Cornel Constantinescu; Prasenjit Sarkar

Deduplication in storage systems has gained momentum recently for its capability to reduce the data footprint. However, deduplication introduces challenges to storage management because storage objects (e.g., files) are no longer independent of each other once they share content. In this paper, we present a graph-based framework to address the storage management challenges introduced by deduplication. Specifically, we model content sharing among storage objects with content sharing graphs (CSG), and apply graph-based algorithms to two real-world storage management use cases for deduplication-enabled storage systems. First, a quasi-linear algorithm was developed to partition deduplication domains with a minimal amount of deduplication loss (i.e., data replicated across partitioned domains) in commercial deduplication-enabled storage systems, whereas the general partitioning problem is NP-complete. For a real-world trace of 3 TB of data with 978 GB of removable duplicates, the proposed algorithm partitions the data into 15 balanced partitions with only 54 GB of deduplication loss, that is, about 5%. Second, a quick and accurate method was developed to query the deduplicated size of a subset of objects in deduplicated storage systems. For the same 3 TB trace, the optimized graph-based algorithm completes the query in 2.6 s, less than 1% of the time taken by the traditional algorithm based on the deduplication metadata.
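The quantity behind the second use case, the deduplicated size of a subset of objects, can be stated with a naive sketch: it is the total size of the distinct chunks the subset references. The optimized CSG-based algorithm in the paper avoids walking per-chunk metadata like this; the version below only makes the definition concrete, and the argument names are illustrative.

```python
def deduplicated_size(object_chunks, chunk_sizes, subset):
    """object_chunks: object id -> set of chunk fingerprints
    chunk_sizes: chunk fingerprint -> size in bytes
    subset: iterable of object ids to query."""
    distinct = set()
    for oid in subset:
        distinct |= object_chunks[oid]
    # Shared chunks are counted once, which is exactly the deduplicated size.
    return sum(chunk_sizes[fp] for fp in distinct)
```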


cluster computing and the grid | 2012

Speculative Memory State Transfer for Active-Active Fault Tolerance

Maohua Lu; Tzi-cker Chiueh

Virtualization makes whole-machine migration possible and thus enables a new form of fault tolerance that is completely transparent to applications and operating systems. The most seamless virtualization-based fault tolerance configuration is an active/active master-slave configuration, in which the memory states of the master and slave virtual machines are periodically synchronized and the slave can immediately take over when the master dies without losing any ongoing connections. The frequency of memory state synchronization has a direct impact on the performance overhead, the application response time, and the fail-over delay. This paper describes a speculative memory state synchronization technique that can effectively reduce the synchronization frequency without increasing the performance overhead, and presents a comprehensive performance study of this technique under three realistic workloads: the TPC-E benchmark, the SPECsfs 2008 CIFS benchmark, and a Microsoft Exchange workload. We show that the proposed technique can cut the amount of memory state synchronization traffic by more than an order of magnitude.
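The cost being reduced here is the volume of dirty memory pages copied at each synchronization point; a speculative scheme tries to avoid copying pages that are likely to be dirtied again before the next checkpoint. The toy filter below only illustrates that idea with a hypothetical write-count predictor; it is not the mechanism from the paper.

```python
def pages_to_send(dirty_pages, write_counts, hot_threshold=3):
    """Split this epoch's dirty pages into those sent now and those deferred.
    write_counts (page -> writes seen this epoch) and hot_threshold are
    hypothetical inputs for illustration only."""
    send_now, defer = [], []
    for page in dirty_pages:
        if write_counts.get(page, 0) >= hot_threshold:
            defer.append(page)    # likely to be rewritten; skip this round
        else:
            send_now.append(page)
    return send_now, defer
```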


Archive | 2010

Random write optimization techniques for flash disks

Tzi-cker Chiueh; Maohua Lu; Pi-Yuan Cheng; Goutham Meruva


Archive | 2010

Scalable and parallel garbage collection method and system for incremental backups with data de-duplication

Maohua Lu; Tzi-cker Chiueh


Archive | 2012

Deduplicating storage with enhanced frequent-block detection

David D. Chambliss; Mihail C. Constantinescu; Joseph S. Glider; Maohua Lu


Archive | 2012

Estimating data reduction in storage systems

David D. Chambliss; Mihail C. Constantinescu; Joseph S. Glider; Maohua Lu
