Fuhui Wu

National University of Defense Technology

Publication

Featured research published by Fuhui Wu.

The Journal of Supercomputing | 2015

Workflow scheduling in cloud: a survey

Fuhui Wu; Qingbo Wu; Yusong Tan

To program in distributed computing environments such as grids and clouds, workflow is adopted as an attractive paradigm for its powerful ability to express a wide range of applications, including scientific computing, multi-tier Web, and big data processing applications. With the development of cloud technology and the extensive deployment of cloud platforms, workflow scheduling in the cloud has become an important research topic. The challenges of the problem are the NP-hard nature of task-resource mapping, diverse QoS requirements, on-demand resource provisioning, performance fluctuation and failure handling, hybrid resource scheduling, and data storage and transmission optimization. Consequently, a number of studies focusing on different aspects have emerged in the literature. In this paper, we first present a taxonomy and comparative review of workflow scheduling algorithms. Then, we make a comprehensive survey of workflow scheduling in cloud environments in a problem–solution manner. Based on the analysis, we also highlight some research directions for future investigation.

APPT 2013 Revised Selected Papers of the 10th International Symposium on Advanced Parallel Processing Technologies - Volume 8299 | 2013

A Vectorized K-Means Algorithm for Intel Many Integrated Core Architecture

Fuhui Wu; Qingbo Wu; Yusong Tan; Lifeng Wei; Lisong Shao; Long Gao

The K-Means algorithm is one of the most popular and effective clustering algorithms for many practical applications. However, direct K-Means methods, which take objects as the processing unit, are computationally expensive, especially in the Objects-Assignment phase on Single-Instruction Single-Data (SISD) processors, typically CPUs. In this paper, we propose a vectorized K-Means algorithm for the Intel Many Integrated Core (MIC) coprocessor, a newly released product from Intel for highly parallel workloads. The new algorithm achieves fine-grained Single-Instruction Multiple-Data (SIMD) parallelism by taking each dimension of all objects as a long vector. It is suitable for objects of any dimensionality, which has received little consideration in preceding work. We also parallelize the vectorized K-Means algorithm on the MIC coprocessor to achieve coarse-grained thread-level parallelism. Finally, we implement and evaluate the vectorized method on the first generation of the Intel MIC product. Measurements show that the MIC-based algorithm achieves the desired speedup over the sequential algorithm on a CPU and demonstrate that the MIC coprocessor offers highly parallel computational power as well as scalability.
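The idea of treating each dimension of all objects as one long vector can be illustrated with a small sketch. This is not the authors' MIC implementation: it is plain Python, and the function name `assign_dimension_major` and the toy data are assumptions made only for illustration. It shows the dimension-major (structure-of-arrays) layout in which the innermost loop walks one contiguous per-dimension vector, which is the loop a SIMD unit would process several objects at a time.

```python
def assign_dimension_major(columns, centroids):
    """Assign each object to its nearest centroid (squared Euclidean distance).

    columns[d][i]   -- coordinate d of object i (dimension-major layout)
    centroids[k][d] -- coordinate d of centroid k
    """
    n = len(columns[0])
    best_dist = [float("inf")] * n
    best_label = [0] * n
    for k, centroid in enumerate(centroids):
        dist = [0.0] * n
        for d, column in enumerate(columns):
            c = centroid[d]
            # Innermost loop runs over all objects for a single dimension.
            # On SIMD hardware this loop would handle several objects per instruction.
            for i in range(n):
                diff = column[i] - c
                dist[i] += diff * diff
        # Keep the closest centroid seen so far for every object.
        for i in range(n):
            if dist[i] < best_dist[i]:
                best_dist[i] = dist[i]
                best_label[i] = k
    return best_label


# Toy data (hypothetical): four 2-D objects and two centroids.
points = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9)]
columns = [list(col) for col in zip(*points)]  # object-major -> dimension-major
labels = assign_dimension_major(columns, [(0.0, 0.0), (5.0, 5.0)])
```

Because the loop over dimensions is outside the loop over objects, the same code works for objects of any dimensionality without changing the inner loop.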

International Conference on Algorithms and Architectures for Parallel Processing | 2015

Maximize Throughput Scheduling and Cost-Fairness Optimization for Multiple DAGs with Deadline Constraint

Wei Wang; Qingbo Wu; Yusong Tan; Fuhui Wu

More and more application workflows are computed in the cloud, and most of them can be expressed as a Directed Acyclic Graph (DAG). Cloud resource providers should guarantee that as many DAGs as possible are completed within their deadlines when requests exceed the available computing resources. In this paper, we define the urgency of a DAG and introduce the MTMD (Maximize Throughput of Multi-DAG with Deadline) algorithm to improve the ratio of DAGs that can be completed within their deadlines. The urgency of a DAG changes during execution and determines the execution order of tasks. With this algorithm, we can detect DAGs that will exceed their deadlines and abandon them promptly. Based on the MTMD algorithm, we put forward the CFS (Cost Fairness Scheduling) algorithm to reduce the unfairness of cost between different DAGs. The simulation results show that the MTMD algorithm outperforms three other algorithms and that the CFS algorithm reduces the cost of all DAGs by 12.1% on average and reduces the unfairness among DAGs by 54.5% on average.

International Conference on Algorithms and Architectures for Parallel Processing | 2015

Unified Multi-constraint and Multi-objective Workflow Scheduling for Cloud System

Fuhui Wu; Qingbo Wu; Yusong Tan; Wei Wang

With the development of cloud computing, the problem of scheduling workflows in cloud systems has attracted a large amount of attention. In general, the cloud workflow scheduling problem requires considering a variety of optimization objectives under several constraints. Traditional workflow scheduling methods focus on a single optimization goal, such as makespan, and a single constraint, such as deadline or budget. In this paper, we first make a unified formalization of the optimality problem of multi-constraint and multi-objective cloud workflow scheduling using Pareto optimality theory. We also present a two-constraint and two-objective case study, with deadline and budget as constraints and energy consumption and reliability as objectives. A general list scheduling algorithm and a tuning mechanism are designed to solve this problem. Extensive experiments confirm the efficiency of the unified multi-constraint and multi-objective cloud workflow scheduling system.
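The two-constraint, two-objective case study can be pictured with a short Pareto-filter sketch. It is not the list scheduling algorithm or tuning mechanism from the paper: it only shows how candidate schedules might be screened against deadline and budget constraints and then reduced to the non-dominated set for energy (lower is better) and reliability (higher is better). The `Schedule` type, the function names, and all numbers are assumptions made for illustration.

```python
from typing import NamedTuple


class Schedule(NamedTuple):
    name: str
    makespan: float     # compared against the deadline constraint
    cost: float         # compared against the budget constraint
    energy: float       # objective to minimize
    reliability: float  # objective to maximize


def dominates(a: Schedule, b: Schedule) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a.energy <= b.energy and a.reliability >= b.reliability
    better = a.energy < b.energy or a.reliability > b.reliability
    return no_worse and better


def pareto_front(schedules, deadline, budget):
    """Drop infeasible schedules, then keep those no feasible schedule dominates."""
    feasible = [s for s in schedules if s.makespan <= deadline and s.cost <= budget]
    return [s for s in feasible if not any(dominates(o, s) for o in feasible)]


# Hypothetical candidates.
candidates = [
    Schedule("A", makespan=10, cost=5, energy=100, reliability=0.90),
    Schedule("B", makespan=12, cost=4, energy=80, reliability=0.85),
    Schedule("C", makespan=11, cost=6, energy=120, reliability=0.88),  # dominated by A
    Schedule("D", makespan=20, cost=3, energy=60, reliability=0.99),   # misses the deadline
]
front = pareto_front(candidates, deadline=15, budget=6)
```

Here A and B both survive because neither is better on both objectives, which is the trade-off a Pareto formulation is meant to expose.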

The Journal of Supercomputing | 2016

Resource stealing: a resource multiplexing method for mix workloads in cloud system

Yusong Tan; Fuhui Wu; Qingbo Wu; Xiangke Liao

The cloud computing paradigm enables providing resources on demand. However, most existing approaches focus on a single type of application with its own quality-of-service requirements. When mixed heterogeneous workloads are co-scheduled in the cloud, resource multiplexing is the key to improving resource utilization while guaranteeing performance. In this paper, we propose a resource stealing mechanism to improve the multiplexing of cloud resources. It enables free resource fragments reserved by some workloads to be utilized by others. To meet a given service level agreement, resource preemption is adopted as a complement to resource stealing. It ensures that each workload receives a minimum amount of resources when required. Moreover, we propose an adaptive joint resource provisioning algorithm, which integrates our resource multiplexing method into elastic resource provisioning. Experimental results reveal that the proposed algorithms improve resource utilization and workload performance simultaneously.

International Symposium on Distributed Computing | 2015

Analysis Range of Coefficients in Learning Rate Methods of Convolution Neural Network

Jiang Zou; Qingbo Wu; Yusong Tan; Fuhui Wu; Wenzhu Wang

Convolutional Neural Network (CNN) is a type of feed-forward artificial neural network that exploits the unknown structure in the input distribution to discover good representations with multiple layers of small neuron collections. CNNs use relatively little pre-processing compared to other classification algorithms and usually use gradient descent to update the parameters of the network. Since CNNs were introduced in the 1990s to deal with face recognition, they have achieved much in many fields and have become the state-of-the-art method in face recognition, speech recognition, etc. To obtain a small error rate and faster training of CNNs, many learning rate methods have been proposed. In this paper, we analyze the range of the coefficients in these methods under the restriction of the maximum convergent constant-step learning rate. In our experiments, we find the maximum convergent learning rate by dichotomy at little computational cost and confirm that the range of coefficients is useful. Moreover, we give a comparison among these methods on speed and error rate.
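Finding the maximum convergent learning rate by dichotomy can be sketched as a bisection search. The sketch below does not train a CNN: as a stand-in it runs plain gradient descent on a one-dimensional quadratic f(w) = 0.5 * curvature * w^2, for which the theoretical limit is 2 / curvature. The function names, the convergence test, and the assumption that every rate below the limit converges while every rate above it does not are all illustrative choices, not the paper's procedure.

```python
def converges(lr, curvature=4.0, steps=2000):
    """Run constant-step gradient descent on f(w) = 0.5 * curvature * w**2."""
    w = 1.0
    for _ in range(steps):
        w -= lr * curvature * w  # gradient of f at w is curvature * w
        if abs(w) > 1e6:         # clearly diverging, stop early
            return False
    return abs(w) < 1e-3


def max_convergent_lr(lo=0.0, hi=10.0, tol=1e-3):
    """Bisection: lo is treated as convergent, hi as divergent."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if converges(mid):
            lo = mid   # mid still converges, so the limit is above it
        else:
            hi = mid   # mid fails, so the limit is below it
    return lo


best_lr = max_convergent_lr()
```

Each bisection step costs one short training run, so the search needs only about log2((hi - lo) / tol) runs; with the defaults here that is 14, and the result lands just below the theoretical limit of 0.5.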

Computer-Aided Design and Computer Graphics | 2013

Speeding Up SIFT Algorithm by Multi-core Processor Supporting SIMD Instruction Sets

Fuhui Wu; Qingbo Wu; Yusong Tan; Xiaoli Sun

The Scale Invariant Feature Transform (SIFT) method plays a critical role in a wide variety of vision applications, but it now faces a real-time computational challenge. Parallel computing is one of the most promising ways to overcome this challenge. In this paper, we target parallelizing SIFT on a multi-core architecture with per-core SIMD support. We focus on the SIMDization of the data-parallel parts of SIFT to fully utilize per-core computing power. At the Orientation Assignment and Keypoint Descriptor stages, we observe that load balance is an important factor. We also implement the optimized algorithm on a multi-core system with SIMD support from the Tianhe-2 supercomputer and compare it with state-of-the-art parallel SIFT algorithms.

International Conference on Trustworthy Computing and Services | 2012

Comparison and Performance Analysis of Join Approach in MapReduce

Fuhui Wu; Qingbo Wu; Yusong Tan

The MapReduce framework has become a general programming model. It has proved its superiority in fields such as sorting and full-text searching. However, as demands become more complicated, MapReduce cannot directly support relational algebra operations, typically join, on heterogeneous data sources. We discuss the factors that influence performance when implementing join in the map function and in the reduce function. We also implement both approaches and analyze them. Experimental results show that the first approach wins when the datasets involved in the join differ significantly in size and one of them is small enough. To retain the advantages of the first approach, we further discuss the case where the smaller dataset grows and improve the approach accordingly.
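The two join placements discussed above can be contrasted with a small in-memory sketch. It does not use Hadoop or any MapReduce API: the two functions only mimic the data flow, and their names and the sample records are assumptions. In the map-side variant, the small dataset is loaded into a lookup table that every mapper can consult, so no shuffle is needed. In the reduce-side variant, records from both inputs are grouped by key first and joined per group.

```python
from collections import defaultdict


def map_side_join(large, small):
    """Join in the 'map' phase: the small dataset must fit in memory."""
    lookup = defaultdict(list)
    for key, value in small:
        lookup[key].append(value)
    # Each record of the large dataset is joined independently, with no grouping step.
    return [(key, lv, sv) for key, lv in large for sv in lookup.get(key, [])]


def reduce_side_join(left, right):
    """Join in the 'reduce' phase: group both inputs by key, then pair them up."""
    groups = defaultdict(lambda: ([], []))
    for key, value in left:    # records tagged as coming from the left input
        groups[key][0].append(value)
    for key, value in right:   # records tagged as coming from the right input
        groups[key][1].append(value)
    return [
        (key, lv, rv)
        for key, (lvs, rvs) in groups.items()
        for lv in lvs
        for rv in rvs
    ]


# Hypothetical inputs: a larger "orders" dataset and a small "users" dataset.
orders = [(1, "o1"), (2, "o2"), (1, "o3"), (3, "o4")]
users = [(1, "alice"), (2, "bob")]
map_result = map_side_join(orders, users)
reduce_result = reduce_side_join(orders, users)
```

Both variants produce the same joined rows; the difference is where the work happens, which is why the map-side form depends on one input being small enough to hold in memory.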

International Conference on Cloud Computing | 2015

Schedule Compaction and Deadline Constrained DAG Scheduling for IaaS Cloud

Fuhui Wu; Qingbo Wu; Yusong Tan; Wei Wang; Xiaoli Sun

Most cloud workflow scheduling algorithms assume that resources are charged under an ideal pay-as-you-go model, which may not be the case in real production cloud systems. Currently, most IaaS cloud providers charge users on a billing-cycle basis. If a resource is terminated before the end of a billing cycle, the payment is still rounded up to one cycle. To address this problem, we first formalize it using the bin-packing method. Then, we propose a DAG schedule compaction algorithm of IC-
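The gap between ideal pay-as-you-go charging and billing-cycle charging described above can be shown with a short calculation. This sketch covers only the rounding rule, not the compaction algorithm. The one-hour cycle, the price, and the function names are assumptions chosen for illustration.

```python
import math


def ideal_cost(runtime_seconds, cycle_seconds=3600, price_per_cycle=1.0):
    """Ideal pay-as-you-go: pay exactly for the time used."""
    return runtime_seconds / cycle_seconds * price_per_cycle


def billed_cost(runtime_seconds, cycle_seconds=3600, price_per_cycle=1.0):
    """Billing-cycle model: partial cycles are rounded up to a full cycle."""
    cycles = max(1, math.ceil(runtime_seconds / cycle_seconds))  # assume at least one cycle is charged
    return cycles * price_per_cycle


short_job = billed_cost(600)     # 10 minutes still pays for one full cycle
exact_job = billed_cost(3600)    # exactly one cycle
spill_job = billed_cost(3601)    # one second over pays for two cycles
```

The paid-but-idle remainder of each cycle is the slack that packing more tasks onto an already rented resource can put to use.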

International Conference on Computer Science and Network Technology | 2013