Weidong Cai

Nanjing University of Information Science and Technology

Publication

Featured research published by Weidong Cai.

Security and Communication Networks | 2016

A speculative approach to spatial-temporal efficiency with multi-objective optimization in a heterogeneous cloud environment

Qi Liu; Weidong Cai; Jian Shen; Zhangjie Fu; Xiaodong Liu; Nigel Linge

A heterogeneous cloud system, for example, a Hadoop 2.6.0 platform, provides distributed but cohesive services with rich features for large-scale management, reliability, and fault tolerance. Where big data processing is concerned, newly built cloud clusters face the challenge of performance optimization, focusing on faster task execution and more efficient usage of computing resources. Existing approaches concentrate on temporal improvement, that is, shortening MapReduce time, but seldom address storage occupation; however, unbalanced cloud storage strategies could exhaust nodes with heavy MapReduce cycles and further threaten the security and stability of the entire cluster. In this paper, an adaptive method is presented aiming at spatial-temporal efficiency in a heterogeneous cloud environment. A prediction model based on an optimized Kernel-based Extreme Learning Machine algorithm is proposed for faster forecasting of job execution duration and space occupation, which in turn facilitates task scheduling through a multi-objective algorithm called time and space optimized NSGA-II (TS-NSGA-II). Experimental results show that, compared with the original load-balancing scheme, our approach saves approximately 47-55 s on average per task execution. Simultaneously, a difference of 1.254 in hard disk occupation was achieved among all scheduled reducers, a 26.6% improvement over the original scheme.

IEEE Transactions on Consumer Electronics | 2016

An adaptive approach to better load balancing in a consumer-centric cloud environment

Qi Liu; Weidong Cai; Jian Shen; Xiaodong Liu; Nigel Linge

Pay-as-you-consume, a new type of cloud computing paradigm, has become increasingly popular as a large number of cloud services gradually open up to consumers. It offers consumers great convenience: users no longer need to buy their own hardware resources, but they are confronted with how to deal effectively with data from the cloud. How to improve the performance of the cloud platform under a consumer-centric cloud computing model becomes a critical issue. Existing heterogeneous distributed computing systems provide efficient, parallel, highly fault-tolerant, and reliable services, owing to their ability to manage large-scale clusters. Though the latest cloud computing clusters meet the need for faster job execution, more effective use of computing resources is still a challenge. Existing methods concentrate on improving the execution time of incoming jobs, e.g., shortening the MapReduce (MR) time. In this paper, an adaptive scheme is offered to achieve time and space efficiency in a heterogeneous cloud environment. A dynamic speculative execution strategy based on real-time management of cluster resources is presented to optimize the execution time of the Map phase, and a prediction model is used for fast prediction of task execution time. Combining the prediction model with a multi-objective optimization algorithm yields an adaptive solution that optimizes space-time performance. Experimental results show that the proposed scheme can allocate tasks evenly and improve work efficiency in a heterogeneous cluster.

International Conference on Advanced Communication Technology | 2017

A speculative execution strategy based on node classification and hierarchy index mechanism for heterogeneous Hadoop systems

Qi Liu; Weidong Cai; Jian Shen; Zhangjie Fu; Xiaodong Liu; Nigel Linge

MapReduce (MR) has been widely used to process large distributed data sets. MRV2, running on YARN, is a more advanced programming model that has attracted considerable attention. Meanwhile, speculative execution is known as an approach for dealing with straggler tasks by backing up tasks running on a low-performance machine to a faster one. In this paper, we address some pitfalls of existing strategies and take the heterogeneous environment into consideration. In addition, node classification is used and a novel hierarchy index mechanism is created. We have implemented the strategy in Hadoop-2.6; the strategy is called Speculation-NC and the optimized Hadoop is called Hadoop-NC. Experimental results show that our method can correctly back up a task, improve the performance of MRV2, and decrease execution time and resource consumption compared with traditional strategies.
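The general idea of speculative execution mentioned in this abstract can be sketched as follows. This is a minimal illustration of the generic technique (estimate each task's remaining time from its progress rate, then back up the slowest outlier on a faster node), not the Speculation-NC strategy itself, whose details the abstract does not give. The `Task` fields, the `slow_factor` threshold, and the node speed scores are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Sequence


@dataclass
class Task:
    task_id: str
    progress: float    # fraction complete, between 0 and 1
    elapsed_s: float   # seconds since the task started
    node_speed: float  # hypothetical relative speed score of the host node


def remaining_time(task: Task) -> float:
    """Estimate seconds left, assuming the average progress rate so far continues."""
    rate = task.progress / task.elapsed_s
    return (1.0 - task.progress) / rate


def pick_backup(tasks: Sequence[Task], slow_factor: float = 1.5) -> Optional[Task]:
    """Return the running task with the longest estimated remaining time,
    provided it exceeds slow_factor times the mean estimate; otherwise None."""
    running = [t for t in tasks if 0.0 < t.progress < 1.0 and t.elapsed_s > 0.0]
    if len(running) < 2:
        return None
    estimates = {t.task_id: remaining_time(t) for t in running}
    mean_left = sum(estimates.values()) / len(estimates)
    candidates = [t for t in running if estimates[t.task_id] > slow_factor * mean_left]
    return max(candidates, key=lambda t: estimates[t.task_id], default=None)


def pick_target_node(node_speeds: Dict[str, float], current_speed: float) -> Optional[str]:
    """Choose the fastest node that is strictly faster than the current host."""
    faster = {name: speed for name, speed in node_speeds.items() if speed > current_speed}
    return max(faster, key=faster.get) if faster else None
```

A scheduler loop would call `pick_backup` periodically and, if it returns a task, launch a duplicate attempt on the node returned by `pick_target_node`, keeping whichever attempt finishes first.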

International Conference on Consumer Electronics | 2016

A load-balancing approach based on modified K-ELM and NSGA-II in a heterogeneous cloud environment

Qi Liu; Weidong Cai; Jian Shen; Dandan Jin; Nigel Linge

MapReduce is a popular programming model widely used in distributed systems. For large-scale applications, e.g., home energy management in a city or online social communities, load balancing becomes a critical factor affecting the performance of distributed computing. Existing load-balancing approaches in MapReduce aim at optimizing task execution time, whereas disk space is not considered. In this paper, a new scheme that combines a modified K-ELM with NSGA-II is proposed. Experimental results show that our method can assign tasks evenly and effectively improve the performance of a cloud system.
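Optimizing execution time and disk space together is a two-objective problem, and NSGA-II ranks candidate solutions by Pareto dominance. The sketch below shows only that dominance test and the extraction of the first non-dominated front for hypothetical (time, disk usage) pairs; it is not a full NSGA-II and not the authors' scheme.

```python
from typing import List, Sequence, Tuple

# Each candidate is (execution time, disk usage); both objectives are minimized.
Objectives = Tuple[float, float]


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b in every objective and strictly better in one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: Sequence[Objectives]) -> List[Objectives]:
    """Return the points that no other point dominates, in input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

For example, among the candidates `(10, 5)`, `(8, 7)`, and `(8, 6)`, the point `(8, 7)` is dropped because `(8, 6)` matches its time and uses less disk, while `(10, 5)` and `(8, 6)` remain as a genuine trade-off between the two objectives.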

International Conference on Advanced Communication Technology | 2016

A smart speculative execution strategy based on node classification for heterogeneous Hadoop systems

Qi Liu; Weidong Cai; Jian Shen; Zhangjie Fu; Nigel Linge

MapReduce (MR) has been widely used to process large distributed data sets. Meanwhile, speculative execution is known as an approach for dealing with straggler tasks by backing up tasks running on a low-performance machine to a faster one. In this paper, we address some pitfalls of existing strategies and take the heterogeneous environment into consideration. We have implemented the strategy in Hadoop-2.6 based on node classification; the strategy is called Speculation-NC and the optimized Hadoop is called Hadoop-NC. Experimental results show that our method can correctly back up a task, improve the performance of MRV2, and decrease execution time and resource consumption compared with the traditional strategy.

Sensors | 2016

Estimation Accuracy on Execution Time of Run-Time Tasks in a Heterogeneous Distributed Environment

Qi Liu; Weidong Cai; Dandan Jin; Jian Shen; Zhangjie Fu; Xiaodong Liu; Nigel Linge

Distributed computing has developed tremendously since cloud computing was proposed in 2006, and has played a vital role in promoting the rapid growth of data collection and analysis models, e.g., the Internet of Things, Cyber-Physical Systems, and Big Data Analytics. Hadoop has become a data convergence platform for sensor networks. As one of its core components, MapReduce facilitates the allocation, processing, and mining of collected large-scale data, where speculative execution strategies help solve straggler problems. However, there is still no efficient solution for accurately estimating the execution time of run-time tasks, which can affect task allocation and distribution in MapReduce. In this paper, task execution data have been collected and employed for the estimation. A two-phase regression (TPR) method is proposed to predict the finishing time of each task accurately. Detailed data for each task are examined and a detailed analysis is reported. According to the results, the prediction accuracy of concurrent tasks’ execution time can be improved, in particular for some regular jobs.

International Conference on Future Generation Communication and Networking | 2015

An Optimized Strategy for Speculative Execution in a Heterogeneous Environment

Qi Liu; Weidong Cai; Zhangjie Fu; Jian Shen; Nigel Linge

MapReduce is a popular programming model for processing large data sets. Speculative execution, an approach for dealing with straggler tasks, works by backing up tasks running on a low-performance machine to a faster one. In this paper, we address some pitfalls of existing strategies and take computer hardware into consideration (HWC-Speculation). We have implemented the strategy in Hadoop-2.6, and experimental results show that our method can assign tasks evenly, improve the performance of MRV2, and decrease execution time.

International Conference on Advanced Cloud and Big Data | 2015

VPCH: A Consistent Hashing Algorithm for Better Load Balancing in a Hadoop Environment

Qi Liu; Weidong Cai; Jian Shen; Baowei Wang; Zhangjie Fu; Nigel Linge

MapReduce (MR) is a popular programming model for processing large data sets among data clusters or grids, e.g., a Hadoop environment. Load balancing, a key factor affecting the performance of map resource distribution, has recently attracted considerable attention as an optimization target. Current MR processes distribute tasks to clusters using hashing with random modulo operations, which can lead to uneven data distribution and skewed loads, thereby degrading the performance of the entire distributed system. In this paper, a virtual partition consistent hashing (VPCH) algorithm is proposed for the reduce stage of MR processes, in order to achieve a better balance in job allocation. According to the results, our method can reduce task execution time with or without the MJR (mapreduce.job.reduce.slowstart.completedmaps) parameter set.
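Consistent hashing with virtual nodes, the family of techniques that VPCH belongs to, can be sketched as below. This is a generic textbook version under stated assumptions, not the VPCH algorithm from the paper: the class name `VirtualNodeRing`, the use of MD5, and the default of 100 virtual nodes per reducer are all illustrative choices.

```python
import bisect
import hashlib
from typing import Hashable, Iterable


def _hash(value: str) -> int:
    """Map a string to a stable 64-bit position on the ring."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


class VirtualNodeRing:
    """Consistent-hash ring in which each reducer owns many virtual nodes."""

    def __init__(self, reducers: Iterable[Hashable], vnodes_per_reducer: int = 100):
        ring = []
        for reducer in reducers:
            for i in range(vnodes_per_reducer):
                ring.append((_hash(f"{reducer}#{i}"), reducer))
        if not ring:
            raise ValueError("at least one reducer and one virtual node are required")
        ring.sort()
        self._points = [position for position, _ in ring]
        self._owners = [owner for _, owner in ring]

    def reducer_for(self, key: str) -> Hashable:
        """Return the owner of the first virtual node clockwise from the key."""
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owners[idx]
```

Spreading each reducer over many ring positions tends to even out the share of keys per reducer, and adding a reducer only moves the keys that fall just before its new virtual nodes, whereas `hash(key) % n` reshuffles most keys when `n` changes.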

International Journal of Grid and Distributed Computing | 2016

An Optimization Scheme in MapReduce for Reduce Stage

Qi Liu; Weidong Cai; Baowei Wang; Zhangjie Fu; Nigel Linge

As a widely used programming model for processing large data sets, MapReduce (MR) has become indispensable in data clusters or grids, e.g., a Hadoop environment. Load balancing, a key factor affecting the performance of map resource distribution, has recently attracted considerable attention as an optimization target. Current MR processes distribute tasks to clusters using hashing with random modulo operations, which can lead to uneven data distribution and skewed loads, thereby degrading the performance of the entire distributed system. In this paper, a virtual partition consistent hashing (VPCH) algorithm is proposed for the reduce stage of MR processes, in order to achieve a better balance in job allocation. In addition, experienced programmers are needed to decide the number of reducers used during the reduce phase of MR, which makes the quality of MR scripts vary. An extreme learning method is therefore employed to recommend the number of reducers a mapped task is likely to need. Execution time is also predicted so that users can better arrange their tasks. According to the results, VPCH can achieve load balancing, and our prediction model provides faster prediction than SVM while maintaining similar accuracy.

International Conference on Cloud Computing | 2015

An Extreme Learning Approach to Fast Prediction in the Reduce Phase of a Cloud Platform

Qi Liu; Weidong Cai; Jian Shen; Baowei Wang; Zhangjie Fu; Nigel Linge

As a widely used programming model for processing large data sets, MapReduce (MR) has become indispensable in data clusters or grids, e.g., a Hadoop environment. However, experienced programmers are needed to decide the number of reducers used during the reduce phase of MR, which makes the quality of MR scripts vary. In this paper, an extreme learning method is employed to recommend the number of reducers a mapped task is likely to need. Execution time is also predicted so that users can better arrange their tasks. According to the results, our method provides faster prediction than SVM while maintaining similar accuracy.