Guangyan Zhang
Tsinghua University
Publications
Featured research published by Guangyan Zhang.
Information Sciences | 2015
Dawei Sun; Guangyan Zhang; Songlin Yang; Weimin Zheng; Samee Ullah Khan; Keqin Li
To achieve high energy efficiency and low response time in big data stream computing environments, an energy-efficient resource scheduling and optimization framework is required. In this paper, we propose such a real-time, energy-efficient framework, termed Re-Stream. First, Re-Stream profiles the mathematical relationship among energy consumption, response time, and resource utilization, and derives the conditions under which high energy efficiency and low response time can both be met. Second, it models a data stream graph using distributed stream computing theory and identifies the critical path within the graph; this makes it possible to calculate the energy consumption of a resource allocation scheme for a data stream graph at a given data stream speed. Third, Re-Stream allocates tasks using an energy-efficient heuristic and a critical-path scheduling mechanism subject to architectural requirements, and then optimizes scheduling online by reallocating the critical vertices on the critical path of the data stream graph to minimize response time and system fluctuation. Moreover, Re-Stream consolidates the non-critical vertices on non-critical paths to improve energy efficiency. We evaluate Re-Stream in terms of energy efficiency and response time for big data stream computing environments. The experimental results demonstrate that Re-Stream improves the energy efficiency of a big data stream computing system and reduces average response time, providing an effective trade-off between the two.
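The scheduling decisions above hinge on knowing the critical path of the data stream graph. The following sketch, which is only an illustration under assumed vertex latencies and not the Re-Stream implementation, computes the critical (longest-latency) path of a small stream graph; those are the vertices a scheduler like Re-Stream would prioritize when reallocating tasks.

```python
# Minimal sketch (not the Re-Stream implementation): finding the critical path
# of a data stream graph, i.e. the chain of vertices with the largest total
# latency, via topological ordering and longest-path dynamic programming.
from collections import defaultdict

def critical_path(vertices, edges, latency):
    """vertices: list of ids; edges: list of (u, v); latency: {vertex: ms}."""
    succ = defaultdict(list)
    indeg = defaultdict(int)
    for u, v in edges:
        succ[u].append(v)
        indeg[v] += 1
    # topological order
    order, queue = [], [v for v in vertices if indeg[v] == 0]
    while queue:
        u = queue.pop()
        order.append(u)
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    # longest-path dynamic programming over the topological order
    dist = {v: latency[v] for v in vertices}
    pred = {v: None for v in vertices}
    for u in order:
        for v in succ[u]:
            if dist[u] + latency[v] > dist[v]:
                dist[v] = dist[u] + latency[v]
                pred[v] = u
    end = max(vertices, key=lambda v: dist[v])
    path, node = [], end
    while node is not None:
        path.append(node)
        node = pred[node]
    return list(reversed(path)), dist[end]

# Example: a 5-vertex stream graph with hypothetical per-tuple latencies (ms).
verts = ["src", "parse", "join", "agg", "sink"]
edges = [("src", "parse"), ("parse", "join"), ("parse", "agg"),
         ("join", "sink"), ("agg", "sink")]
lat = {"src": 1, "parse": 4, "join": 7, "agg": 2, "sink": 1}
print(critical_path(verts, edges, lat))  # (['src', 'parse', 'join', 'sink'], 13)
```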
ACM Transactions on Storage | 2007
Guangyan Zhang; Jiwu Shu; Wei Xue; Weimin Zheng
Round-robin striping, due to its uniform distribution and low-complexity computation, is widely used by applications that demand high bandwidth and massive storage. Because many systems cannot be taken offline when their storage capacity and I/O bandwidth need to grow, an efficient online mechanism for adding disks to striped volumes is very important. This article presents and proves that, during the data redistribution caused by scaling a round-robin striped volume, there is always a reordering window within which the order of data movements can be changed while data consistency is maintained. By exploiting this reordering window, we propose SLAS, an approach to scaling round-robin striped volumes that effectively reduces the cost of data redistribution. First, SLAS applies a new mapping management solution based on a sliding window to support data redistribution without loss of scalability; second, it uses lazy updates of mapping metadata to decrease the number of metadata writes required by data redistribution; third, it changes the order of data chunk movements to aggregate reads/writes of data chunks. Our results from detailed simulations using real-system workloads show that, compared with the traditional approach, SLAS can reduce redistribution duration by up to 40.79% while keeping the maximum response time of foreground I/Os similar. Finally, our discussion indicates that the SLAS approach works for both disk addition to and disk removal from striped volumes.
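For a sense of the redistribution cost at stake, the sketch below (an illustration with hypothetical chunk counts, not SLAS itself) shows the standard round-robin mapping of a striped volume and counts how many chunks change location when the volume grows from four to six disks; nearly every chunk must move, which is why an efficient redistribution mechanism matters.

```python
# Minimal sketch (assumptions, not the SLAS code): round-robin placement and
# the number of chunks whose location changes when disks are added while the
# round-robin layout is preserved.

def rr_location(chunk, disks):
    """Round-robin placement: (disk index, offset on that disk)."""
    return chunk % disks, chunk // disks

def chunks_moved_by_restripe(total_chunks, old_disks, new_disks):
    return sum(
        1 for c in range(total_chunks)
        if rr_location(c, old_disks) != rr_location(c, new_disks)
    )

total = 12_000
moved = chunks_moved_by_restripe(total, old_disks=4, new_disks=6)
print(f"{moved}/{total} chunks move")  # almost every chunk has to be relocated
```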
IEEE Transactions on Computers | 2010
Guangyan Zhang; Weimin Zheng; Jiwu Shu
When a RAID-5 volume is scaled up with added disks, data have to be redistributed from the original disks to all disks, both original and new. Existing online scaling techniques suffer from long redistribution times as well as negative impacts on application performance. By leveraging our insight into a reordering window, this paper presents ALV, a new data redistribution approach to RAID-5 scaling. The reordering window results from the space hole that naturally appears and grows as data are redistributed; the data inside the reordering window can migrate in any order without overwriting other in-use data chunks. The ALV approach exploits three novel techniques. First, ALV changes the movement order of data chunks to access multiple successive chunks via a single I/O. Second, ALV updates mapping metadata lazily to minimize the number of metadata writes while ensuring data consistency. Third, ALV uses an on/off logical valve to adaptively adjust the redistribution rate depending on application workload. We implemented ALV in Linux Kernel 2.6.18 and evaluated its performance by replaying three real-system traces: TPC-C, Cello-99, and SPC-Web. The results demonstrated that ALV consistently outperformed the conventional approach by 53.31-73.91 percent in user response time and by 24.07-29.27 percent in redistribution time.
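One of ALV's optimizations is to access multiple successive chunks with a single I/O. The toy sketch below (illustrative only; chunk numbers and sizes are hypothetical, and this is not the kernel implementation) shows the kind of run coalescing that turns many per-chunk transfers into a few large sequential ones.

```python
# Minimal sketch (illustrative only): coalescing the movement of successive
# data chunks into large sequential I/Os instead of one read/write per chunk.

CHUNK_SIZE = 64 * 1024  # 64 KiB chunks (assumed)

def coalesce_moves(chunks_to_move):
    """Group consecutive chunk numbers into (start_chunk, count) runs so each
    run can be fetched with one sequential read and written with one write."""
    runs = []
    for c in sorted(chunks_to_move):
        if runs and c == runs[-1][0] + runs[-1][1]:
            runs[-1] = (runs[-1][0], runs[-1][1] + 1)
        else:
            runs.append((c, 1))
    return runs

moves = [8, 9, 10, 11, 20, 21, 40]
for start, count in coalesce_moves(moves):
    print(f"read {count * CHUNK_SIZE} bytes starting at chunk {start}")
# -> three I/Os instead of seven
```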
Science in China Series F: Information Sciences | 2013
Guangyan Zhang; Jianping Qiu; Jiwu Shu; Weimin Zheng
Existing data management tools have limitations such as being restricted to specific file systems or lacking transparency to applications. In this paper, we present a new data management tool called AIP, which is implemented via the standard data management API and hence supports multiple file systems and keeps data management operations transparent to applications. First, AIP provides centralized policy-based data management for controlling the placement of files in different storage tiers. Second, AIP uses differentiated collections of file states to improve the execution efficiency of data management policies, with the help of a file-state caching mechanism. Third, AIP provides a resource arbitration mechanism that controls the rate at which data management operations are initiated. Our results from representative experiments demonstrate that AIP provides high performance, introduces low management overhead, and scales well.
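For a sense of what centralized policy-based placement looks like, the sketch below uses a hypothetical policy format (not AIP's actual interface) to choose a storage tier for a file from its access age and size.

```python
# Minimal sketch (hypothetical policy format, not AIP's API): policy-driven
# tier selection -- the first matching rule decides where a file is placed.
import time

POLICIES = [
    # (predicate over file metadata, target tier), evaluated in order
    (lambda f: time.time() - f["atime"] > 30 * 86400, "capacity_tier"),
    (lambda f: f["size"] > 1 << 30,                   "capacity_tier"),
    (lambda f: True,                                  "performance_tier"),
]

def target_tier(file_meta):
    for predicate, tier in POLICIES:
        if predicate(file_meta):
            return tier

f = {"path": "/data/report.csv", "size": 4096, "atime": time.time() - 90 * 86400}
print(target_tier(f))  # capacity_tier: the file has not been touched for 90 days
```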
IEEE International Conference on Cloud Computing Technology and Science | 2015
Xiaochun Yun; Guangjun Wu; Guangyan Zhang; Keqin Li; Shupeng Wang
Range-aggregate queries apply an aggregate function to all tuples within given query ranges. Existing approaches to range-aggregate queries are insufficient to quickly provide accurate results in big data environments. In this paper, we propose FastRAQ, a fast approach to range-aggregate queries in big data environments. FastRAQ first divides big data into independent partitions with a balanced partitioning algorithm, and then generates a local estimation sketch for each partition. When a range-aggregate query request arrives, FastRAQ obtains the result directly by summarizing local estimates from all partitions. FastRAQ has O(1) time complexity for data updates and O(N/(P×B)) time complexity for range-aggregate queries, where N is the number of distinct tuples for all dimensions, P is the partition number, and B is the bucket number in the histogram. We implement the FastRAQ approach on the Linux platform and evaluate its performance with about 10 billion data records. Experimental results demonstrate that FastRAQ provides range-aggregate query results within a time period two orders of magnitude lower than that of Hive, while the relative error is less than 3 percent within the given confidence interval.
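The sketch below mirrors the structure described above, a local estimation sketch per partition combined at query time, using simple equi-width histograms as the local sketch. It is an illustration only, not the FastRAQ implementation.

```python
# Minimal sketch (not FastRAQ): answering a range-COUNT query by summing
# per-partition histogram estimates. Updates touch one bucket; queries combine
# local estimates from every partition.
import bisect, random

class Partition:
    def __init__(self, lo, hi, buckets):
        self.edges = [lo + i * (hi - lo) / buckets for i in range(buckets + 1)]
        self.counts = [0] * buckets

    def insert(self, value):
        b = min(max(bisect.bisect_right(self.edges, value) - 1, 0),
                len(self.counts) - 1)
        self.counts[b] += 1

    def estimate(self, lo, hi):
        # Assume values are spread uniformly inside each bucket.
        total = 0.0
        for i, c in enumerate(self.counts):
            b_lo, b_hi = self.edges[i], self.edges[i + 1]
            overlap = max(0.0, min(hi, b_hi) - max(lo, b_lo))
            if b_hi > b_lo:
                total += c * overlap / (b_hi - b_lo)
        return total

partitions = [Partition(0, 1000, buckets=16) for _ in range(4)]
for _ in range(100_000):
    random.choice(partitions).insert(random.uniform(0, 1000))

estimate = sum(p.estimate(200, 450) for p in partitions)  # combine local estimates
print(round(estimate), "vs. exact expectation of about", 100_000 // 4 * 1)
```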
IEEE Transactions on Computers | 2015
Guangyan Zhang; Keqin Li; Jingzhe Wang; Weimin Zheng
Disk additions to a RAID-6 storage system can increase the I/O parallelism and expand the storage capacity simultaneously. To regain load balance among all disks, old and new, RAID-6 scaling requires moving certain data blocks onto the newly added disks. Existing approaches to RAID-6 scaling, restricted by preserving a round-robin data distribution, require migrating all the data, which makes RAID-6 scaling expensive. In this paper, we propose RS6, a new approach to accelerating RDP RAID-6 scaling by reducing disk I/Os and XOR operations. First, RS6 minimizes the number of data blocks to be moved while maintaining a uniform data distribution across all data disks. Second, RS6 piggybacks parity updates during data migration to reduce the cost of maintaining consistent parities. Third, RS6 selects parameters of data migration so as to reduce disk I/Os for parity updates. Our mathematical analysis indicates that RS6 provides uniform data distribution, minimal data migration, and fast data addressing. We also conducted extensive simulation experiments to quantitatively characterize the properties of RS6. The results show that, compared with existing “moving-everything” round-robin approaches, RS6 reduces the number of blocks to be moved by 60.0%-88.9% and shortens the migration time by 40.27%-69.88%.
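The payoff of abandoning the round-robin constraint comes from the minimal-migration bound: when m data disks join n existing ones, a uniform layout can be regained by moving only m/(n+m) of the blocks. The sketch below (illustrative, not RS6 itself) computes that plan for hypothetical disk counts.

```python
# Minimal sketch (illustrative, not RS6): the minimal-migration bound for
# rebalancing when m disks join n existing disks that each hold the same
# number of blocks.

def minimal_migration_plan(blocks_per_old_disk, n_old, m_new):
    total = blocks_per_old_disk * n_old
    target = total // (n_old + m_new)            # blocks each disk should end with
    move_per_old_disk = blocks_per_old_disk - target
    moved = move_per_old_disk * n_old
    return moved, moved / total

moved, fraction = minimal_migration_plan(blocks_per_old_disk=1000, n_old=6, m_new=2)
print(moved, f"{fraction:.1%}")  # 1500 blocks, 25.0% -- versus 100% for re-striping
```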
IEEE Transactions on Computers | 2014
Guangyan Zhang; Weimin Zheng; Keqin Li
In RAID-5, data and parity blocks are distributed across all disks in a round-robin fashion. Previous approaches to RAID-5 scaling preserve such a round-robin distribution, thereby requiring all the data to be migrated. In this paper, we rethink the RAID-5 data layout and propose a new approach to RAID-5 scaling called MiPiL. First, MiPiL minimizes data migration while maintaining a uniform data distribution, not only for regular data but also for parity data. It moves the minimum number of data blocks from old disks to new disks to regain a uniform data distribution. Second, MiPiL optimizes online data migration with piggyback parity updates and lazy metadata updates. Piggyback parity updates during data migration reduce the number of additional XOR computations and disk I/Os. Lazy metadata updates minimize the number of metadata writes without compromising data reliability. We implement MiPiL in Linux Kernel 2.6.32.9 and evaluate its performance by replaying three real-system traces. The results demonstrate that MiPiL consistently outperforms the existing “moving-everything” approach by 74.07-77.57% in redistribution time and by 25.78-70.50% in user response time. The experiments also illustrate that under the WebSearch2 and Financial1 workloads, the performance of a RAID-5 array scaled using MiPiL is almost identical to that of a round-robin RAID-5 array.
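The piggybacked parity update can be illustrated with a small sketch. Assuming the destination slot of a migrated block previously held zeros (an assumption made for this illustration, not a statement about MiPiL's layout), the stripe parity can be refreshed with a single XOR against the block that is already in memory, avoiding a separate read-modify-write of the whole stripe.

```python
# Minimal sketch (assumed semantics, not the MiPiL kernel code): a piggybacked
# parity update when a migrated block lands in a previously zeroed stripe slot.

def xor_blocks(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def piggyback_parity(old_parity: bytes, migrated_block: bytes) -> bytes:
    # The destination slot contributed all-zero data to the old parity, so
    # XOR-ing in the new block alone yields the correct new parity.
    return xor_blocks(old_parity, migrated_block)

stripe = [b"\x11" * 8, b"\x22" * 8]        # existing data blocks in the stripe
parity = xor_blocks(stripe[0], stripe[1])  # 0x33 repeated
new_block = b"\x0f" * 8                    # block being migrated in
parity = piggyback_parity(parity, new_block)
# Matches a full recomputation over all three data blocks:
assert parity == xor_blocks(xor_blocks(stripe[0], stripe[1]), new_block)
print(parity.hex())                        # 3c3c3c3c3c3c3c3c
```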
IEEE Transactions on Cloud Computing | 2015
Shengmei Luo; Guangyan Zhang; Chengwen Wu; Samee Ullah Khan; Keqin Li
As data progressively grow within data centers, cloud storage systems continuously face challenges in saving storage capacity and providing the capabilities necessary to move big data within an acceptable time frame. In this paper, we present Boafft, a cloud storage system with distributed deduplication. Boafft achieves scalable throughput and capacity by using multiple data servers to deduplicate data in parallel, with a minimal loss of deduplication ratio. First, Boafft uses an efficient data routing algorithm based on data similarity that reduces network overhead by quickly identifying the storage location. Second, Boafft maintains an in-memory similarity index in each data server that helps avoid a large number of random disk reads and writes, which in turn accelerates local data deduplication. Third, Boafft constructs a hot fingerprint cache in each data server based on access frequency, so as to improve the data deduplication ratio. Our comparative analysis with EMC's stateful routing algorithm reveals that Boafft can provide a comparatively high deduplication ratio with a low network bandwidth overhead. Moreover, Boafft makes better use of the storage space, with higher read/write bandwidth and good load balance.
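The sketch below illustrates similarity-based data routing with a hypothetical rule (not Boafft's exact algorithm): a superchunk is represented by a few of its chunk fingerprints and routed to the server whose in-memory index already contains the most of them, falling back to hashing when nothing similar is found.

```python
# Minimal sketch (hypothetical routing rule, not Boafft): similarity-based data
# routing for distributed deduplication.
import hashlib

def fingerprint(chunk: bytes) -> str:
    return hashlib.sha1(chunk).hexdigest()

def representative_sample(chunks, k=4):
    # Use the k smallest fingerprints as a stable sample of the superchunk.
    return sorted(fingerprint(c) for c in chunks)[:k]

def route(chunks, server_indexes):
    sample = representative_sample(chunks)
    scores = [sum(fp in idx for fp in sample) for idx in server_indexes]
    if max(scores) == 0:                              # nothing similar anywhere
        return int(sample[0], 16) % len(server_indexes)
    return scores.index(max(scores))

servers = [set(), set(), set()]                       # per-server similarity index
superchunk = [b"alpha", b"beta", b"gamma", b"delta", b"epsilon"]
servers[1].update(fingerprint(c) for c in superchunk[:3])  # server 1 has seen some of it
print(route(superchunk, servers))                     # -> 1
```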
IEEE Transactions on Computers | 2007
Guangyan Zhang; Jiwu Shu; Wei Xue; Weimin Zheng
Out-of-band virtualization intrinsically has the potential to provide high performance and good scalability. Unfortunately, existing out-of-band virtualization systems have limitations, such as restrictions to specific platforms and/or hardware. In this paper, we present a new out-of-band virtualization system, MagicStore, which is not tied to any specific hardware and supports three widely used host platforms: Windows, Solaris, and Linux. First, MagicStore uses the SLAS2 approach to scale round-robin striped volumes efficiently. Second, it survives panics and power failures robustly through a combination of lazy synchronizations, ordered writes, and REDO logging. Third, it incorporates typical legacy storage quickly by analyzing partition tables and reconstructing logical volumes. Our evaluation results from representative experiments demonstrate that MagicStore provides high performance, introduces low processor overhead, and scales well.
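The REDO-logging discipline mentioned above can be sketched as follows (a conceptual illustration, not MagicStore's on-disk format): a mapping change is appended to a log and flushed before it is applied in place, so that replaying the log after a crash restores any change whose in-place update was lost.

```python
# Minimal sketch (conceptual, not MagicStore's format): write-ahead REDO
# logging of mapping metadata changes.
import json, os

LOG = "redo.log"
mapping = {}                                   # in-memory copy of mapping metadata

def update_mapping(key, value):
    record = json.dumps({"key": key, "value": value})
    with open(LOG, "a") as log:                # 1. log the intent and flush it...
        log.write(record + "\n")
        log.flush()
        os.fsync(log.fileno())
    mapping[key] = value                       # 2. ...then apply the change in place

def recover():
    if not os.path.exists(LOG):
        return
    with open(LOG) as log:                     # replay every logged change
        for line in log:
            rec = json.loads(line)
            mapping[rec["key"]] = rec["value"]

update_mapping("lv0/chunk42", {"disk": 3, "offset": 1024})
mapping.clear()                                # simulate losing in-memory state
recover()
print(mapping["lv0/chunk42"])                  # {'disk': 3, 'offset': 1024}
```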
International Conference on Parallel Processing | 2016
Xiaqing Li; Guangyan Zhang; H. Howie Huang; Zhufan Wang; Weimin Zheng
As one of the most important deep learning models, convolutional neural networks (CNNs) have achieved great success in a number of applications such as image classification, speech recognition, and natural language understanding. Training CNNs on large data sets is computationally expensive, leading to a flurry of research and development of open-source parallel implementations on GPUs. However, few studies have been performed to evaluate the performance characteristics of those implementations. In this paper, we conduct a comprehensive comparison of these implementations over a wide range of parameter configurations, investigate potential performance bottlenecks, and point out a number of opportunities for further optimization.
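A performance comparison of this kind boils down to sweeping parameter configurations and timing each one. The sketch below shows that shape with a naive pure-Python convolution as a stand-in for the GPU kernels the paper actually measures; the image and kernel sizes are arbitrary choices for illustration.

```python
# Minimal sketch (not the paper's benchmark): timing a convolution across a
# sweep of parameter configurations.
import itertools, time

def conv2d_naive(image, kernel):
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = [[0.0] * (iw - kw + 1) for _ in range(ih - kh + 1)]
    for y in range(ih - kh + 1):
        for x in range(iw - kw + 1):
            out[y][x] = sum(image[y + i][x + j] * kernel[i][j]
                            for i in range(kh) for j in range(kw))
    return out

for size, ksize in itertools.product([32, 64], [3, 5]):
    image = [[1.0] * size for _ in range(size)]
    kernel = [[0.1] * ksize for _ in range(ksize)]
    start = time.perf_counter()
    conv2d_naive(image, kernel)
    print(f"image {size}x{size}, kernel {ksize}x{ksize}: "
          f"{(time.perf_counter() - start) * 1000:.1f} ms")
```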