Publication


Featured research published by Yusong Tan.


The Journal of Supercomputing | 2015

Workflow scheduling in cloud: a survey

Fuhui Wu; Qingbo Wu; Yusong Tan

For programming in distributed computing environments such as grids and clouds, workflow is adopted as an attractive paradigm for its power in expressing a wide range of applications, including scientific computing, multi-tier Web, and big data processing applications. With the development of cloud technology and the extensive deployment of cloud platforms, workflow scheduling in the cloud has become an important research topic. The challenges of the problem lie in the NP-hard nature of task-resource mapping; diverse QoS requirements; on-demand resource provisioning; performance fluctuation and failure handling; hybrid resource scheduling; and data storage and transmission optimization. Consequently, a number of studies focusing on different aspects have emerged in the literature. In this paper, we first present a taxonomy and comparative review of workflow scheduling algorithms. Then, we make a comprehensive survey of workflow scheduling in the cloud environment in a problem-solution manner. Based on the analysis, we also highlight some research directions for future investigation.


APPT 2013: Revised Selected Papers of the 10th International Symposium on Advanced Parallel Processing Technologies, Volume 8299 | 2013

A Vectorized K-Means Algorithm for Intel Many Integrated Core Architecture

Fuhui Wu; Qingbo Wu; Yusong Tan; Lifeng Wei; Lisong Shao; Long Gao

The K-Means algorithm is one of the most popular and effective clustering algorithms for many practical applications. However, direct K-Means methods, which take objects as the processing unit, are computationally expensive, especially in the objects-assignment phase on Single-Instruction Single-Data (SISD) processors such as CPUs. In this paper, we propose a vectorized K-Means algorithm for the Intel Many Integrated Core (MIC) coprocessor, a newly released Intel product for highly parallel workloads. The new algorithm achieves fine-grained Single-Instruction Multiple-Data (SIMD) parallelism by taking each dimension of all objects as a long vector. This vectorized algorithm is suitable for objects of any dimensionality, which has received little attention in preceding work. We also parallelize the vectorized K-Means algorithm on the MIC coprocessor to achieve coarse-grained thread-level parallelism. Finally, we implement and evaluate the vectorized method on the first generation of the Intel MIC product. Measurements show that the algorithm achieves the desired speedup over the sequential CPU algorithm and demonstrate that the MIC coprocessor offers highly parallel computational power as well as scalability.
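To make the dimension-as-vector idea concrete, here is a minimal NumPy sketch of the objects-assignment phase (an illustration only, not the authors' MIC/SIMD implementation): storing the data dimension-major, so that each dimension of all objects is one long contiguous vector, lets the distance computation run as a handful of vector operations.

```python
import numpy as np

def assign_objects(data_by_dim, centroids):
    """Objects-assignment phase of K-Means with a dimension-as-vector layout.

    data_by_dim: array of shape (d, n) -- each row is one dimension of all
                 n objects, i.e. a long vector that maps naturally to SIMD.
    centroids:   array of shape (k, d).
    Returns the index of the nearest centroid for every object.
    """
    d, n = data_by_dim.shape
    k = centroids.shape[0]
    dists = np.zeros((k, n))
    for j in range(k):
        for dim in range(d):
            # One vectorized operation over all n objects at once.
            diff = data_by_dim[dim] - centroids[j, dim]
            dists[j] += diff * diff
    return np.argmin(dists, axis=0)

def update_centroids(data_by_dim, labels, k):
    """Recompute centroids from the current assignment."""
    d, n = data_by_dim.shape
    centroids = np.zeros((k, d))
    for j in range(k):
        members = labels == j
        if members.any():
            centroids[j] = data_by_dim[:, members].mean(axis=1)
    return centroids

# Toy usage: 1000 random 8-dimensional objects, 4 clusters.
rng = np.random.default_rng(0)
data = rng.random((8, 1000))          # stored dimension-major
centroids = data[:, :4].T.copy()      # initial centroids, shape (k, d)
for _ in range(10):
    labels = assign_objects(data, centroids)
    centroids = update_centroids(data, labels, 4)
```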


International Conference on Algorithms and Architectures for Parallel Processing | 2015

Maximize Throughput Scheduling and Cost-Fairness Optimization for Multiple DAGs with Deadline Constraint

Wei Wang; Qingbo Wu; Yusong Tan; Fuhui Wu

More and more application workflows are computed in the cloud, and most of them can be expressed as a Directed Acyclic Graph (DAG). Cloud resource providers should guarantee that as many DAGs as possible are accomplished within their deadlines, even when requests exceed the available compute resources. In this paper, we define the urgency of a DAG and introduce the MTMD (Maximize Throughput of Multi-DAG with Deadline) algorithm to improve the ratio of DAGs that can be accomplished within their deadlines. The urgency of a DAG changes during execution and determines the execution order of tasks. The algorithm detects DAGs that will exceed their deadlines and abandons them in a timely manner. Based on the MTMD algorithm, we put forward the CFS (Cost Fairness Scheduling) algorithm to reduce the unfairness of cost between different DAGs. The simulation results show that the MTMD algorithm outperforms three other algorithms, and the CFS algorithm reduces the cost of all DAGs by 12.1% on average and reduces the unfairness among DAGs by 54.5% on average.
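The urgency-driven ordering and early abandonment can be sketched as follows. The abstract does not give the urgency formula, so the ratio used here (remaining critical-path work over slack to the deadline) is an assumed stand-in, and the class and function names are illustrative only.

```python
import time
from dataclasses import dataclass

@dataclass
class DagState:
    """Runtime bookkeeping for one DAG (fields are illustrative)."""
    name: str
    deadline: float            # absolute deadline (seconds, epoch time)
    remaining_work: float      # estimated remaining critical-path time (s)
    abandoned: bool = False

def urgency(dag, now):
    """Assumed urgency metric: remaining critical-path work over slack.

    Values greater than 1 mean the DAG can no longer finish in time.
    The paper defines its own urgency measure; this is only a stand-in.
    """
    slack = dag.deadline - now
    if slack <= 0:
        return float("inf")
    return dag.remaining_work / slack

def schedule_step(dags, now):
    """Order DAGs by urgency; abandon DAGs that cannot meet their deadline."""
    runnable = []
    for dag in dags:
        if dag.abandoned:
            continue
        u = urgency(dag, now)
        if u > 1.0:
            dag.abandoned = True    # abandon early to free resources
        else:
            runnable.append((u, dag))
    runnable.sort(key=lambda p: p[0], reverse=True)  # most urgent first
    return [dag for _, dag in runnable]

# Toy usage
now = time.time()
dags = [DagState("A", now + 100, 60),
        DagState("B", now + 50, 55),
        DagState("C", now + 200, 40)]
order = schedule_step(dags, now)
print([d.name for d in order], [d.name for d in dags if d.abandoned])
```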


International Conference on Algorithms and Architectures for Parallel Processing | 2015

Unified Multi-constraint and Multi-objective Workflow Scheduling for Cloud System

Fuhui Wu; Qingbo Wu; Yusong Tan; Wei Wang

With the development of cloud computing, the problem of scheduling workflows in cloud systems has attracted a large amount of attention. In general, the cloud workflow scheduling problem requires considering a variety of optimization objectives under several constraints. Traditional workflow scheduling methods focus on a single optimization goal, such as makespan, and a single constraint, such as deadline or budget. In this paper, we first give a unified formalization of the multi-constraint and multi-objective cloud workflow scheduling problem using Pareto optimality theory. We also present a two-constraint and two-objective case study, considering deadline and budget constraints and energy-consumption and reliability objectives. A general list scheduling algorithm and a tuning mechanism are designed to solve this problem. Extensive experiments confirm the efficiency of the unified multi-constraint and multi-objective cloud workflow scheduling system.
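The Pareto-optimality formulation can be made concrete with a short sketch of the standard dominance test, using the case study's deadline/budget as constraints and energy/reliability as objectives. This is an illustration of the general concept, not the paper's list scheduling algorithm; the candidate values below are made up.

```python
def feasible(s, deadline, budget):
    """A schedule is feasible if it satisfies both constraints."""
    return s["makespan"] <= deadline and s["cost"] <= budget

def dominates(a, b):
    """Pareto dominance for (minimize energy, maximize reliability)."""
    no_worse = a["energy"] <= b["energy"] and a["reliability"] >= b["reliability"]
    strictly_better = a["energy"] < b["energy"] or a["reliability"] > b["reliability"]
    return no_worse and strictly_better

def pareto_front(schedules, deadline, budget):
    """Keep feasible schedules that no other feasible schedule dominates."""
    feas = [s for s in schedules if feasible(s, deadline, budget)]
    return [s for s in feas
            if not any(dominates(t, s) for t in feas if t is not s)]

# Toy candidate schedules (illustrative values only)
candidates = [
    {"makespan": 90, "cost": 40, "energy": 120, "reliability": 0.95},
    {"makespan": 80, "cost": 45, "energy": 150, "reliability": 0.99},
    {"makespan": 95, "cost": 38, "energy": 160, "reliability": 0.90},  # dominated
    {"makespan": 130, "cost": 30, "energy": 100, "reliability": 0.97}, # infeasible
]
print(pareto_front(candidates, deadline=100, budget=50))
```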


International Conference on Cloud Computing | 2016

TMVCE—topology-aware multipath Virtual Cluster embedding algorithm

Rongzhen Li; Jianfeng Zhang; Yusong Tan; Qingbo Wu

Virtual Clusters provide the basis for offering distributed parallel systems to tenants by sharing resources in a cloud data center. Allocating physical resources for a virtual cluster is known as the virtual cluster embedding (VCE) problem, a critical issue that affects both the performance of the virtual cluster and the resource utilization of the system. In order to effectively reduce the runtime of virtual cluster embedding and improve the revenue/cost ratio, this paper proposes a topology-aware multipath virtual cluster embedding algorithm (TMVCE). Virtual cluster topology information, mainly the degree and closeness of the topology network, is used as the measurement parameters of VCE. Extensive experimental tests and comparisons with related algorithms show that TMVCE obtains higher embedding efficiency and, to some extent, improves the revenue/cost ratio.
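The two topology measures named in the abstract, degree and closeness, are standard graph metrics; the sketch below computes them over an adjacency-list view of the physical network. It is our own illustration, and the combined score with equal weights is an assumption, not the paper's actual candidate-ranking function.

```python
from collections import deque

def degree(graph, node):
    """Number of direct neighbours of a node."""
    return len(graph[node])

def closeness(graph, node):
    """Closeness centrality: (n - 1) / sum of shortest-path distances (BFS)."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    total = sum(d for n, d in dist.items() if n != node)
    return (len(dist) - 1) / total if total else 0.0

def rank_nodes(graph):
    """Rank nodes by a combined topology score (equal weights assumed)."""
    return sorted(graph, key=lambda n: degree(graph, n) + closeness(graph, n),
                  reverse=True)

# Toy physical topology: four hosts behind two switches
topology = {
    "s1": ["s2", "h1", "h2"],
    "s2": ["s1", "h3", "h4"],
    "h1": ["s1"], "h2": ["s1"], "h3": ["s2"], "h4": ["s2"],
}
print(rank_nodes(topology))
```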


The Journal of Supercomputing | 2016

Resource stealing: a resource multiplexing method for mix workloads in cloud system

Yusong Tan; Fuhui Wu; Qingbo Wu; Xiangke Liao

The cloud computing paradigm enables providing resources on demand. However, most cloud systems focus on a single type of application with its own quality-of-service requirements. When heterogeneous workloads are co-scheduled in the cloud, resource multiplexing is the key to improving resource utilization while preserving performance guarantees. In this paper, we propose a resource stealing mechanism to improve the multiplexing of cloud resources. It enables free resource fragments reserved by some workloads to be utilized by others. To meet certain service level agreements, resource preemption is adopted as a complement to resource stealing; it ensures that each workload receives a minimum amount of resources when required. Moreover, we propose an adaptive joint resource provisioning algorithm that integrates our resource multiplexing method into elastic resource provisioning. Experimental results reveal that the proposed algorithms improve resource utilization and workload performance simultaneously.
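The stealing-plus-preemption idea can be sketched as a single resource pool with per-workload reservations. The class, method names, and policy details below are our own illustrative assumptions, not the paper's mechanism.

```python
class ResourcePool:
    """Per-workload reservations with stealing of idle fragments (illustrative)."""

    def __init__(self, reservations):
        self.reserved = dict(reservations)          # workload -> reserved units
        self.used = {w: 0 for w in reservations}
        self.stolen = {w: 0 for w in reservations}  # units lent to others

    def idle(self, workload):
        return self.reserved[workload] - self.used[workload] - self.stolen[workload]

    def request(self, workload, amount):
        """Grant from own reservation first, then steal idle units from others."""
        grant = min(amount, max(self.idle(workload), 0))
        self.used[workload] += grant
        for other in self.reserved:
            if grant >= amount:
                break
            if other == workload:
                continue
            steal = min(amount - grant, max(self.idle(other), 0))
            self.stolen[other] += steal
            self.used[workload] += steal
            grant += steal
        return grant

    def preempt(self, owner, amount):
        """Reclaim stolen units so the owner gets its minimum guarantee back.

        A full implementation would also take the units back from the
        borrower; this sketch only adjusts the owner's bookkeeping.
        """
        reclaimed = min(amount, self.stolen[owner])
        self.stolen[owner] -= reclaimed
        return reclaimed

# Toy usage: batch jobs steal what the latency-sensitive service is not using
pool = ResourcePool({"service": 8, "batch": 4})
print(pool.request("batch", 6))     # 4 own units + 2 stolen from "service"
print(pool.preempt("service", 2))   # service reclaims its fragments
```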


Journal of Parallel and Distributed Computing | 2016

micMR: An efficient MapReduce framework for CPU–MIC heterogeneous architecture

Wenzhu Wang; Yusong Tan; Qingbo Wu; Yaoxue Zhang

With the rapid development of processors, coprocessor-based MapReduce has been widely studied. In this paper, we propose micMR, an efficient MapReduce framework for the CPU–MIC heterogeneous architecture. micMR provides the following new features. First, a two-level split and a SIMD-friendly map are designed to utilize the Vector Processing Units on the MIC. Second, a heterogeneous pipelined reduce is developed to improve the efficiency of resource utilization. Third, a memory management scheme is designed for efficiently accessing <key, value> pairs in both the host and the MIC memory. In addition, optimization techniques, including load balancing, SIMD hash, and asynchronous task transfer, are designed to achieve further speedups. We have developed micMR not only on a single node with a CPU and a MIC but also on a CPU–MIC heterogeneous cluster. The experimental results show that micMR is up to 8.4x and 45.8x faster than Phoenix++, a high-performance MapReduce system for symmetric multiprocessing systems, and up to 2.0x and 5.1x faster than Hadoop on a CPU–MIC cluster.
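The pipelined-reduce idea, reduce work overlapping with the map phase instead of waiting for it to finish, can be illustrated with a small host-side producer/consumer sketch. This is only an analogy: micMR's offloading of map tasks to the MIC, SIMD-friendly map, and host/device buffer management are not modelled here, and all names are ours.

```python
import queue
import threading
from collections import defaultdict

def pipelined_mapreduce(chunks, map_fn, reduce_fn):
    """Overlap map and reduce with a producer/consumer pipeline (sketch only)."""
    pairs = queue.Queue(maxsize=1024)
    groups = defaultdict(list)

    def mapper():
        for chunk in chunks:                 # input "splits"
            for kv in map_fn(chunk):
                pairs.put(kv)
        pairs.put(None)                      # end-of-stream marker

    def reducer():
        while True:
            kv = pairs.get()
            if kv is None:
                break
            k, v = kv
            groups[k].append(v)              # reduce consumes while map runs

    t_map = threading.Thread(target=mapper)
    t_red = threading.Thread(target=reducer)
    t_map.start(); t_red.start()
    t_map.join(); t_red.join()
    return {k: reduce_fn(vs) for k, vs in groups.items()}

# Toy usage: word count over two input splits
splits = ["the quick brown fox", "the lazy dog the end"]
wc = pipelined_mapreduce(splits,
                         map_fn=lambda s: [(w, 1) for w in s.split()],
                         reduce_fn=sum)
print(wc)
```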


International Symposium on Distributed Computing | 2015

Analysis Range of Coefficients in Learning Rate Methods of Convolution Neural Network

Jiang Zou; Qingbo Wu; Yusong Tan; Fuhui Wu; Wenzhu Wang

The Convolutional Neural Network (CNN) is a type of feed-forward artificial neural network that exploits unknown structure in the input distribution to discover good representations with multiple layers of small neuron collections. CNNs require relatively little pre-processing compared to other classification algorithms and usually use gradient descent to update the parameters of the network. Since CNNs were introduced in 1997 to deal with face recognition, they have achieved much in many fields and have become the state-of-the-art method in face recognition, speech recognition, and other areas. To obtain a small error rate and faster training of CNNs, many learning rate methods have been proposed. In this paper, we analyze the range of the coefficients in these methods under the restriction of the maximum convergent constant-step learning rate. In our experiments, we find the maximum convergent learning rate by dichotomy with little computational cost, and we confirm that the derived range of coefficients is useful. Moreover, we give a comparison among these methods in terms of speed and error rate.
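The dichotomy search for the maximum convergent constant learning rate is, in essence, a bisection over the learning rate with a convergence test as the oracle. The sketch below assumes a black-box `converges(lr)` check (e.g. a short trial training run); the tolerance, bounds, and toy loss are our own choices, not the paper's experimental setup.

```python
def max_convergent_lr(converges, lo=0.0, hi=1.0, tol=1e-3):
    """Find the largest constant learning rate for which training converges.

    `converges(lr)` is assumed to run a short trial and return True/False;
    the search itself is a plain bisection (the 'dichotomy' in the abstract).
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if converges(mid):
            lo = mid          # mid still converges: search higher
        else:
            hi = mid          # mid diverges: search lower
    return lo

# Toy usage: 1-D quadratic loss 0.5*x^2; gradient descent converges iff lr < 2
def toy_converges(lr, steps=100):
    x = 1.0
    for _ in range(steps):
        x -= lr * x           # gradient of 0.5*x^2 is x
        if abs(x) > 1e6:
            return False
    return abs(x) <= 1.0

print(max_convergent_lr(toy_converges, lo=0.0, hi=4.0))   # about 2.0
```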


International Conference on Intelligent Science and Big Data Engineering | 2015

An Efficient MapReduce Framework for Intel MIC Cluster

Wenzhu Wang; Qingbo Wu; Yusong Tan; Yaoxue Zhang

MapReduce is a distributed programming framework that processes large-scale data sets by employing clusters in a scale-out way. However, scaling up a single node can be better than a scale-out solution because of lower communication overhead. As the Intel MIC offers higher performance than an ordinary CPU, we propose an efficient MapReduce framework for Intel MIC clusters. Our framework provides several new features, such as a fault-tolerance mechanism for MIC management, efficient buffer management in MIC memory, and asynchronous task transfer between the CPU and the MIC. It can manage a large-scale MIC cluster and run applications in a MapReduce-like way. The experimental results show that our system is up to 1.35x and 6.8x faster than Hadoop on an ordinary CPU cluster.


International Conference on Computer Science and Network Technology | 2015

Multiple resources scheduling for diverse workloads in heterogeneous datacenter

Wenzhu Wang; Yusong Tan; Qingbo Wu; Yaoxue Zhang

Multi-resource allocation in large-scale datacenters has been widely studied and used in recent years. However, existing scheduling methods have shortcomings in heterogeneous environments, such as starving certain jobs, leaving some important resources unallocated, and decreasing system throughput. In this paper, we generalize the most popular multi-resource scheduling algorithm, Dominant Resource Fairness (DRF), to the heterogeneous datacenter and propose the Heterogeneous DRF (H-DRF) algorithm. In H-DRF, we first introduce a weight parameter for coprocessors to avoid job starvation and important resources remaining unallocated. Second, we consider the running tasks and the average execution time of each job to ensure the fairness and throughput of the system. Finally, we implement the H-DRF algorithm in the YARN resource manager for scheduling diverse workloads. The experimental results show that our proposed algorithm, H-DRF, leads to higher resource utilization and better throughput than the DRF sharing scheme in a heterogeneous datacenter.
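The weighted dominant-share idea can be illustrated with a short sketch: standard DRF schedules the job with the smallest dominant share, and the coprocessor weight mentioned in the abstract is modelled here as a per-resource multiplier. The exact H-DRF weighting and its running-task/average-execution-time factors are not reproduced; the cluster and job figures are made up.

```python
def dominant_share(usage, capacity, weights):
    """Largest weighted share of any resource used by a job."""
    return max(weights[r] * usage[r] / capacity[r] for r in capacity)

def pick_next_job(jobs, capacity, weights):
    """DRF-style choice: serve the job with the smallest dominant share."""
    return min(jobs, key=lambda j: dominant_share(jobs[j], capacity, weights))

# Toy heterogeneous cluster: CPUs, memory (GB), and MIC coprocessor cards.
capacity = {"cpu": 64, "mem": 256, "mic": 8}
weights = {"cpu": 1.0, "mem": 1.0, "mic": 2.0}   # value coprocessors more
jobs = {
    "analytics": {"cpu": 16, "mem": 32, "mic": 0},
    "training":  {"cpu": 8,  "mem": 16, "mic": 2},
}
print(pick_next_job(jobs, capacity, weights))    # -> "analytics"
```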

Collaboration


Dive into Yusong Tan's collaborations.

Top Co-Authors (all at National University of Defense Technology):

Qingbo Wu
Fuhui Wu
Rongzhen Li
Jianfeng Zhang
Wenzhu Wang
Wei Wang
Xiaoli Sun
Jie Lin
Jing Wang
Xiaoling Li