Yongwei Wu
Tsinghua University
Publication
Featured research published by Yongwei Wu.
IEEE Transactions on Parallel and Distributed Systems | 2010
Yongwei Wu; Kai Hwang; Yulai Yuan; Weimin Zheng
Predicting grid performance is a complex task because heterogeneous resource nodes are involved in a distributed environment. Long-running workloads on a grid are even harder to predict due to heavy load fluctuations. In this paper, we use a Kalman filter to minimize prediction errors and apply a Savitzky-Golay filter to train a sequence of confidence windows, which keeps the prediction process from being disturbed by load fluctuations. We present a new adaptive hybrid method (AHModel) for load prediction guided by the trained confidence windows. We test the effectiveness of this new prediction scheme with real-life workload traces on AuverGrid and Grid5000 in France. Both theoretical and experimental results are reported. As the lookahead span increases from 10 to 50 steps (5 minutes per step), AHModel predicts the grid workload with a mean-square error (MSE) of 0.04-0.73 percent, compared with 2.54-30.2 percent for the static point-value autoregression (AR) prediction method. This significant gain in prediction accuracy makes the new model very attractive for predicting grid performance. The model proved especially effective for predicting large workloads that demand very long execution times, such as jobs exceeding 4 hours on Grid5000 over 5,000 processors. With minor changes to some system parameters, AHModel can apply to other computational grids as well. At the end, we discuss extended research issues and tool development for grid performance prediction.
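The forecasting pipeline the abstract describes can be sketched minimally: smooth the load trace, fit an autoregressive model by least squares, and roll it forward for multi-step predictions. This is an illustrative sketch assuming only numpy, not the paper's AHModel, which additionally uses a Kalman filter and Savitzky-Golay-trained confidence windows (a simple moving average stands in for the Savitzky-Golay filter here; all names are invented).

```python
# Sketch: smooth a load trace, fit AR(p) by least squares, forecast n steps.
import numpy as np

def fit_ar(series, p):
    """Least-squares AR(p): x_t ~ a_1*x_{t-1} + ... + a_p*x_{t-p}."""
    X = np.array([series[t - p:t][::-1] for t in range(p, len(series))])
    y = series[p:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def forecast(series, coef, steps):
    """Feed predictions back in to forecast `steps` points ahead."""
    hist = list(series)
    preds = []
    for _ in range(steps):
        nxt = float(np.dot(coef, hist[:-len(coef) - 1:-1]))
        hist.append(nxt)
        preds.append(nxt)
    return preds

trace = np.arange(100.0)                                   # stand-in load series
smooth = np.convolve(trace, np.ones(5) / 5, mode="valid")  # crude smoothing
preds = forecast(smooth, fit_ar(smooth, p=2), steps=5)
```

On this noiseless linear trace the fitted AR(2) model simply continues the trend; real grid traces would need the paper's filtering machinery to tame fluctuations.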
IEEE Transactions on Parallel and Distributed Systems | 2013
Fengyuan Ren; Jiao Zhang; Yongwei Wu; Tao He; Canfeng Chen; Chuang Lin
Resources, especially energy, are quite limited in wireless sensor networks (WSNs). Since sensor nodes are usually densely deployed, the data they sample carry much redundancy, and data aggregation becomes an effective way to eliminate that redundancy, minimize the number of transmissions, and thereby save energy. Because many applications can be deployed in WSNs and various sensors are embedded in the nodes, packets generated by heterogeneous sensors or different applications have different attributes, and packets from different applications cannot be aggregated together. Moreover, most data aggregation schemes employ static routing protocols, which cannot dynamically or intentionally forward packets according to network state or packet type; the spatial isolation caused by static routing is unfavorable to data aggregation. To make data aggregation more efficient, in this paper we introduce the concept of a packet attribute, defined as an identifier of the data sampled by different kinds of sensors or applications, and then propose an attribute-aware data aggregation (ADA) scheme consisting of a packet-driven timing algorithm and a special dynamic routing protocol. Inspired by the concept of potential in physics and pheromones in ant colonies, a potential-based dynamic routing protocol is elaborated to support the ADA strategy. Performance evaluation across a series of scenarios verifies that the ADA scheme makes packets with the same attribute spatially convergent as much as possible and therefore improves the efficiency of data aggregation. Furthermore, the ADA scheme offers other desirable properties, such as scalability with respect to network size and adaptability for tracking mobile events.
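The potential-plus-pheromone routing idea can be illustrated with a toy next-hop rule: each node has a depth potential (hops to the sink) and each (node, attribute) pair a pheromone level left by earlier packets of that attribute; a packet follows the neighbor minimizing depth minus pheromone, so same-attribute packets tend to converge onto shared paths. The data structures, weights, and names below are invented for illustration, not the paper's protocol.

```python
# Toy potential-based, attribute-aware next-hop selection.
def next_hop(node, attr, neighbors, depth, pheromone, alpha=1.0, beta=0.5):
    def potential(n):
        # lower is better: closer to sink, reinforced by same-attribute traffic
        return alpha * depth[n] - beta * pheromone.get((n, attr), 0.0)
    return min(neighbors[node], key=potential)

# Two equal-depth neighbors; pheromone for "temp" packets tips the choice.
neighbors = {"a": ["b", "c"]}
depth = {"b": 1, "c": 1}
pheromone = {("c", "temp"): 2.0}
```

Here `next_hop("a", "temp", ...)` picks `"c"` because of the attribute pheromone, while packets of other attributes fall back to plain shortest-path behavior.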
grid computing | 2007
Yongwei Wu; Yulai Yuan; Guangwen Yang; Weimin Zheng
Due to the dynamic nature of grid environments, scheduling algorithms need the assistance of long-time-ahead load prediction to decide how to use grid resources efficiently. In this paper, we present and evaluate a new hybrid model that predicts n-step-ahead load status using interval values. The model integrates an autoregressive (AR) model with confidence interval estimation to forecast the future load of a system. Two filtering technologies from the signal processing field are also introduced to eliminate data noise and enhance prediction accuracy. Experiments conducted on a real grid environment demonstrate that the new model predicts n-step-ahead load in a computational grid more accurately than previous work. The proposed hybrid model performs well for prediction lead times of up to 50 minutes, with significantly lower prediction errors than the conventional AR model, and it achieves an interval length acceptable to task schedulers.
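Interval-valued forecasting can be sketched by wrapping point predictions in confidence bands whose width grows with the forecast horizon, estimated here from the spread of one-step changes in the history. This is a hand-rolled illustration assuming numpy; the paper's estimator is built around its AR model and filtering stages, not this naive formula.

```python
# Sketch: n-step-ahead interval forecasts with horizon-widening bands.
import numpy as np

def interval_forecast(history, point_preds, z=1.96):
    resid = np.diff(np.asarray(history, dtype=float))  # one-step changes
    sigma = resid.std(ddof=1)                          # crude residual spread
    # band half-width grows like sqrt(horizon), as for a random walk
    return [(p - z * sigma * np.sqrt(k + 1), p + z * sigma * np.sqrt(k + 1))
            for k, p in enumerate(point_preds)]

history = [1.0, 1.2, 0.9, 1.1, 1.0, 1.3]
bands = interval_forecast(history, point_preds=[1.1, 1.1, 1.1])
```

A scheduler could then reserve resources against the upper band rather than the point forecast, trading some utilization for fewer deadline misses.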
international symposium on microarchitecture | 2015
Jinglei Ren; Jishen Zhao; Samira Manabi Khan; Jongmoo Choi; Yongwei Wu; Onur Mutlu
Emerging byte-addressable nonvolatile memories (NVMs) promise persistent memory, which allows processors to directly access persistent data in main memory. Yet, persistent memory systems need to guarantee a consistent memory state in the event of power loss or a system crash (i.e., crash consistency). To guarantee crash consistency, most prior works rely on programmers to (1) partition persistent and transient memory data and (2) use specialized software interfaces when updating persistent memory data. As a result, taking advantage of persistent memory requires significant programmer effort, e.g., to implement new programs as well as modify legacy programs. Use cases and adoption of persistent memory can therefore be largely limited. In this paper, we propose a hardware-assisted DRAM+NVM hybrid persistent memory design, Transparent Hybrid NVM (ThyNVM), which supports software-transparent crash consistency of memory data in a hybrid memory system. To efficiently enforce crash consistency, we design a new dual-scheme checkpointing mechanism, which efficiently overlaps checkpointing time with application execution time. The key novelty is to enable checkpointing of data at multiple granularities, cache block or page granularity, in a coordinated manner. This design is based on our insight that there is a tradeoff between the application stall time due to checkpointing and the hardware storage overhead of the metadata for checkpointing, both of which are dictated by the granularity of checkpointed data. To get the best of the tradeoff, our technique adapts the checkpointing granularity to the write locality characteristics of the data and coordinates the management of multiple-granularity updates. Our evaluation across a variety of applications shows that ThyNVM performs within 4.9% of an idealized DRAM-only system that can provide crash consistency at no cost.
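The granularity tradeoff at the heart of the dual-scheme mechanism can be sketched in software: given the set of dirty cache blocks, checkpoint a page wholesale when most of its blocks are dirty (high write locality), and checkpoint individual blocks otherwise. The constants and the 50% threshold below are illustrative stand-ins, not ThyNVM's actual hardware policy.

```python
# Sketch: pick checkpoint granularity per page from write locality.
from collections import Counter

BLOCKS_PER_PAGE = 64  # assumed geometry for illustration

def checkpoint_plan(dirty_blocks, threshold=0.5):
    """dirty_blocks: iterable of (page_id, block_id) pairs."""
    per_page = Counter(page for page, _ in dirty_blocks)
    # dense pages -> cheap metadata at page granularity;
    # sparse pages -> block granularity avoids copying clean data
    return {page: "page" if count / BLOCKS_PER_PAGE >= threshold else "block"
            for page, count in per_page.items()}

dirty = [(0, b) for b in range(40)] + [(1, 2), (1, 7)]
plan = checkpoint_plan(dirty)
```

Page 0 (40 of 64 blocks dirty) is checkpointed as a page, while page 1's two stray writes are checkpointed as individual blocks, mirroring the stall-time versus metadata-overhead tradeoff the paper describes.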
advanced parallel programming technologies | 2005
Yongwei Wu; Song Wu; Huashan Yu; Chunming Hu
The ChinaGrid Support Platform (CGSP) provides a grid toolkit for ChinaGrid application developers and specific grid constructors in order to reduce their development cost as much as possible. This paper mainly discusses CGSP's extensible and reconfigurable framework, which satisfies ChinaGrid's requirements for expansion and autonomy. In the framework, a domain denotes a unit that can provide grid services to end users by itself. Particular attention is paid to the layered structure of domains and the corresponding interactions among them. The design motivation of CGSP and a simple execution management mechanism are also described.
IEEE Transactions on Parallel and Distributed Systems | 2016
Kai Hwang; Xiaoying Bai; Yue Shi; Muyang Li; Wenguang Chen; Yongwei Wu
In this paper, we present generic cloud performance models for evaluating IaaS, PaaS, SaaS, and mashup or hybrid clouds. We test clouds with real-life benchmark programs and propose some new performance metrics. Our benchmark experiments are conducted mainly on IaaS cloud platforms over scale-out and scale-up workloads. Cloud benchmarking results are analyzed in terms of the efficiency, elasticity, QoS, productivity, and scalability of cloud performance. Five cloud benchmarks were tested on the Amazon IaaS EC2 cloud: YCSB, CloudSuite, HiBench, BenchClouds, and TPC-W. For production services, the choice between scale-up and scale-out solutions should be made primarily according to workload patterns and the resource utilization rates required. Scaling out machine instances incurs much lower overhead than that experienced in scale-up experiments; however, scaling up is found to be more cost-effective for sustaining heavier workloads. Cloud productivity is largely attributed to system elasticity, efficiency, QoS, and scalability. We find that auto-scaling is easy to implement but tends to over-provision resources; it may yield lower resource utilization rates than scale-out or scale-up strategies. We also demonstrate that the proposed cloud performance models apply to evaluating PaaS, SaaS, and hybrid clouds as well.
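The scale-up versus scale-out comparison comes down to throughput obtained per unit of spend. The back-of-envelope metric and all the numbers below are illustrative assumptions, not the paper's measured results or its exact productivity formula.

```python
# Sketch: compare scaling strategies by throughput per dollar-hour.
def cost_effectiveness(throughput_ops, price_per_hour):
    return throughput_ops / price_per_hour

# Hypothetical: 8 small instances vs. 1 large instance at equal total price.
scale_out = cost_effectiveness(throughput_ops=8 * 900, price_per_hour=8 * 0.10)
scale_up = cost_effectiveness(throughput_ops=7800, price_per_hour=0.80)
```

With these made-up figures the single large instance wins, consistent with the abstract's finding that scale-up is more cost-effective under heavy sustained load, while the per-instance overhead of scale-out shows up as the throughput gap.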
IEEE Transactions on Parallel and Distributed Systems | 2014
Xun Zhao; Yang Zhang; Yongwei Wu; Kang Chen; Jinlei Jiang; Keqin Li
Virtual machines (VMs) serve as a crucial component of cloud computing, with a rich set of convenient features. The high overhead of a VM has been well addressed by hardware support such as Intel Virtualization Technology (VT) and by improvements in recent hypervisor implementations such as Xen and KVM. However, the high demand on VM image storage remains a challenging problem. Existing systems have made efforts to reduce VM image storage consumption by means of deduplication within a storage area network (SAN) cluster. Nevertheless, a SAN cannot satisfy the increasing demand of large-scale VM hosting for cloud computing because of its cost. In this paper, we propose Liquid, a scalable deduplication file system designed specifically for large-scale VM deployment. Its design provides fast VM deployment through peer-to-peer (P2P) data transfer and low storage consumption through deduplication of VM images. It also provides a comprehensive set of storage features, including instant cloning of VM images, on-demand fetching through the network, and caching on local disks via copy-on-read techniques. Experiments show that Liquid's features perform well and introduce only minor performance overhead.
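The two headline ideas, deduplication and instant cloning, both fall out of a content-addressed chunk store: images are stored as recipes of chunk fingerprints, identical chunks are kept once, and cloning copies only the recipe. This is a generic sketch of that technique, assuming fixed-size chunks and SHA-256 fingerprints; Liquid's on-disk format and fingerprinting details are not specified here.

```python
# Sketch: content-addressed chunk store with recipe-based instant cloning.
import hashlib

class DedupStore:
    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}   # fingerprint -> chunk bytes (stored once)
        self.recipes = {}  # image name -> ordered fingerprint list

    def put(self, name, data):
        fps = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            fp = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(fp, chunk)  # dedup: keep first copy only
            fps.append(fp)
        self.recipes[name] = fps

    def get(self, name):
        return b"".join(self.chunks[fp] for fp in self.recipes[name])

    def clone(self, src, dst):
        # instant cloning: copy the recipe, never the data
        self.recipes[dst] = list(self.recipes[src])
```

Two images that share most of their content then cost little more than one, and a clone is a constant-size metadata copy regardless of image size.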
international conference on parallel processing | 2011
Yifeng Geng; Shimin Chen; Yongwei Wu; Ryan Wu; Guangwen Yang; Weimin Zheng
MapReduce is an important programming model for processing and generating large data sets in parallel. It is commonly applied in applications such as web indexing, data mining, and machine learning. As an open-source implementation of MapReduce, Hadoop is now widely used in industry. Virtualization, which is easy to configure and economical to use, shows great potential for cloud computing. With growing core counts per CPU and the adoption of virtualization, one physical machine can host more and more virtual machines, but I/O devices do not multiply as rapidly. Because MapReduce systems often run I/O-intensive applications, reduced data redundancy and load imbalance, both of which increase I/O interference in a virtual cloud, become serious problems. This paper builds a model and defines metrics to analyze the data allocation problem in virtual environments theoretically, and designs a location-aware file block allocation strategy that retains compatibility with native Hadoop. Model simulation and experiments on a real system show that the new strategy achieves better data redundancy and load balance, thereby reducing I/O interference. Execution times of applications such as RandomWriter, Text Sort, and Word Count are reduced by up to 33% and by 10% on average.
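A location-aware placement policy of this flavor can be sketched as: place each block's replicas on VMs backed by distinct physical hosts, visiting less-loaded hosts first, so co-located VMs never hold redundant copies that contend for the same disk. The data structures and tie-breaking below are invented for illustration and are not the paper's actual strategy or Hadoop's block placement API.

```python
# Sketch: replica placement that respects the VM-to-physical-host mapping.
def place_replicas(vm_to_host, host_load, replicas=3):
    chosen, used_hosts = [], set()
    # visit VMs in order of their host's current block count
    for vm, host in sorted(vm_to_host.items(), key=lambda kv: host_load[kv[1]]):
        if host in used_hosts:
            continue  # never put two replicas behind the same physical disk
        chosen.append(vm)
        used_hosts.add(host)
        host_load[host] += 1
        if len(chosen) == replicas:
            break
    return chosen

vm_to_host = {"vm1": "h1", "vm2": "h1", "vm3": "h2", "vm4": "h3"}
host_load = {"h1": 5, "h2": 1, "h3": 3}
placement = place_replicas(vm_to_host, host_load)
```

Native Hadoop, which sees only VM-level "nodes", could happily put two replicas on `vm1` and `vm2` and lose a physical copy's worth of redundancy; the host-aware rule above avoids exactly that.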
grid and cooperative computing | 2007
Yulai Yuan; Yongwei Wu; Guangwen Yang; Feng Yu
Efficient data access is one way to improve the performance of a data grid. To speed up data access and reduce bandwidth consumption, a data grid replicates essential data in multiple locations. This paper studies data replication strategies in data grids, taking into account two important constraints on replication: the storage capacity of different nodes and the bandwidth between them. We propose a new dynamic replication strategy based on the principle of local optimization: the data grid achieves globally optimized data access through the interaction of locally optimal decisions within local optimization areas.
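A local replication decision weighing exactly these two constraints, storage and bandwidth, can be sketched as: replicate a file when the transfer time saved over its expected accesses outweighs a storage-cost threshold and capacity allows. Every parameter and the threshold below are invented for illustration; the paper's local-optimization strategy is richer than this single rule.

```python
# Sketch: one node's local replicate-or-not decision.
def should_replicate(size_mb, accesses_per_day, bandwidth_mbps,
                     free_mb, cost_threshold=10.0):
    if size_mb > free_mb:
        return False  # storage capacity constraint
    # seconds of remote-transfer time saved per day if replicated locally
    saved_seconds = accesses_per_day * (size_mb * 8.0 / bandwidth_mbps)
    return saved_seconds >= cost_threshold

hot = should_replicate(size_mb=100, accesses_per_day=50,
                       bandwidth_mbps=100, free_mb=1000)
cold = should_replicate(size_mb=100, accesses_per_day=1,
                        bandwidth_mbps=100, free_mb=1000)
```

Each node applying such a rule to its own traffic is the "local optimization"; the global access improvement emerges from these decisions interacting across optimization areas.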
Journal of Grid Computing | 2004
Guangwen Yang; Hai Jin; Minglu Li; Nong Xiao; Wei Li; Zhaohui Wu; Yongwei Wu; Feilong Tang
Grid computing presents a new approach to distributed computation and Internet applications: it can construct a virtual single image of heterogeneous resources, provide a uniform application interface, and integrate widespread computational resources into a super, ubiquitous, and transparent aggregation. In adopting grid computing, China, which faces greater resource heterogeneity and other specific demands, has put much effort into both research and practical utilization. In this paper, we introduce the major Chinese grid research projects and their prospective applications. First we give an overview of the four government-sponsored grid programs, namely China National Grid, ChinaGrid, NSFC Grid, and ShanghaiGrid. Then we present six representative ongoing grid systems in detail, categorized into grid middleware and grid applications. This paper provides a general picture of grid computing in China and shows the country's great effort, devotion, and confidence in using grid technology to boost society, the economy, and scientific research.