Kefeng Deng
National University of Defense Technology
Publications
Featured research published by Kefeng Deng.
IEEE International Conference on High Performance Computing, Data and Analytics | 2013
Kefeng Deng; Junqiang Song; Kaijun Ren; Alexandru Iosup
Long-term execution of scientific applications often leads to dynamic workloads and varying application requirements. When the execution uses resources provisioned from IaaS clouds, and thus consumption-based payment, efficient online scheduling algorithms must be found. Portfolio scheduling, which dynamically selects a suitable policy from a broad portfolio, may provide a solution to this problem. However, selecting the right policy online from possibly tens of alternatives remains challenging. In this work, we introduce an abstract model to explore this selection problem. Based on the model, we present a comprehensive portfolio scheduler that includes tens of provisioning and allocation policies. We propose an algorithm that increases the chance of selecting the best policy within limited time, possibly online. Through trace-based simulation, we evaluate various aspects of our portfolio scheduler and find performance improvements from 7% to 100% over the best constituent policies, with particularly high improvement for bursty workloads.
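The core of the portfolio scheduler described above is simulation-based policy selection: before each scheduling interval, every candidate policy is replayed against the recent workload and the one with the best simulated utility is activated. The following is a minimal sketch of that selection loop; the two toy policies, the wait-time utility metric, and the workload representation are illustrative assumptions, not the paper's implementation.

```python
import random

# Hypothetical workload window: each job is (arrival_time, runtime) in seconds.
def fifo_policy(jobs):
    """Simulate running jobs in arrival order on a single machine."""
    clock, total_wait = 0.0, 0.0
    for arrival, runtime in sorted(jobs):
        clock = max(clock, arrival)
        total_wait += clock - arrival
        clock += runtime
    return total_wait

def sjf_policy(jobs):
    """Simulate running jobs shortest-first (arrival order ignored for simplicity)."""
    clock, total_wait = 0.0, 0.0
    for arrival, runtime in sorted(jobs, key=lambda j: j[1]):
        clock = max(clock, arrival)
        total_wait += clock - arrival
        clock += runtime
    return total_wait

PORTFOLIO = {"fifo": fifo_policy, "sjf": sjf_policy}

def select_policy(recent_jobs):
    """Simulate every policy in the portfolio on the recent workload window
    and return the name of the policy with the lowest total wait time."""
    scores = {name: policy(recent_jobs) for name, policy in PORTFOLIO.items()}
    return min(scores, key=scores.get)

if __name__ == "__main__":
    window = [(random.uniform(0, 60), random.uniform(1, 30)) for _ in range(50)]
    print("Selected policy:", select_policy(window))
```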
International Conference on Parallel Processing | 2013
Siqi Shen; Kefeng Deng; Alexandru Iosup; Dick H. J. Epema
Deploying applications on leased cloud infrastructure is increasingly considered by a variety of business and service integrators. However, selecting a leasing strategy (larger or faster instances? on-demand or reserved instances?) and configuring that strategy with appropriate scheduling policies remains daunting for the (potential) cloud user. In this work, we investigate leasing strategies and their policies from a broker's perspective. We propose CoH, a family of Cloud-based, online, Hybrid scheduling policies that minimize rental cost by using both on-demand and reserved instances. We formulate the resource provisioning and job allocation policies as integer programming problems. Because the policies must execute online, we limit the time spent searching for the optimal solution of the integer program, compare the obtained solution with various heuristics-based policies, and automatically pick the best one. We show, via simulation with multiple real-world traces, that the hybrid leasing policy can achieve significantly lower cost than typical heuristics-based policies.
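A central idea in CoH is to compute several candidate leasing plans, one from a time-limited solve of the integer program and others from simple heuristics, and then commit to the cheapest. The sketch below illustrates only that pick-the-best step with a toy cost model; the instance prices, the two heuristics, and the brute-force bounded search standing in for the ILP solver are assumptions for illustration.

```python
import time

ON_DEMAND_RATE = 0.10   # $/instance-hour (assumed)
RESERVED_RATE  = 0.04   # effective $/instance-hour for reserved capacity (assumed)

def plan_cost(num_reserved, hourly_demand):
    """Cost of keeping `num_reserved` reserved instances for the whole horizon
    and covering the remaining demand with on-demand instances."""
    reserved_cost = num_reserved * RESERVED_RATE * len(hourly_demand)
    on_demand_cost = sum(max(0, d - num_reserved) * ON_DEMAND_RATE
                         for d in hourly_demand)
    return reserved_cost + on_demand_cost

def heuristic_all_on_demand(hourly_demand):
    return 0

def heuristic_reserve_minimum(hourly_demand):
    return min(hourly_demand)

def bounded_search(hourly_demand, time_budget_s=0.05):
    """Stand-in for the time-limited ILP solve: enumerate reserved counts
    until the time budget runs out and keep the best plan seen so far."""
    deadline = time.monotonic() + time_budget_s
    best_r, best_cost = 0, plan_cost(0, hourly_demand)
    for r in range(1, max(hourly_demand) + 1):
        if time.monotonic() > deadline:
            break
        cost = plan_cost(r, hourly_demand)
        if cost < best_cost:
            best_r, best_cost = r, cost
    return best_r

def choose_plan(hourly_demand):
    """Compare the bounded-search plan with the heuristic plans and pick
    the cheapest, mirroring CoH's automatic selection."""
    candidates = [heuristic_all_on_demand(hourly_demand),
                  heuristic_reserve_minimum(hourly_demand),
                  bounded_search(hourly_demand)]
    return min(candidates, key=lambda r: plan_cost(r, hourly_demand))

if __name__ == "__main__":
    demand = [3, 5, 8, 6, 4, 2, 7, 9, 5, 3]   # instances needed per hour
    r = choose_plan(demand)
    print(f"reserve {r} instances, total cost ${plan_cost(r, demand):.2f}")
```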
Job Scheduling Strategies for Parallel Processing | 2013
Kefeng Deng; Ruben Verboon; Kaijun Ren; Alexandru Iosup
The popularity of data centers in scientific computing has led to new architectures, new workload structures, and growing customer bases. As a consequence, selecting efficient scheduling algorithms for the data center is an increasingly costly and difficult challenge. To address this challenge, and in contrast to previous work on scheduling for scientific workloads, we focus in this work on portfolio scheduling: the dynamic selection and use of a scheduling policy, depending on the current system and workload conditions, from a portfolio of multiple policies. We design a periodic portfolio scheduler for the workload of the entire data center and equip it with a portfolio of resource provisioning and allocation policies. Through simulation based on real and synthetic workload traces, we show evidence that portfolio scheduling can automatically select the scheduling policy to match both user and data center objectives, and that it can perform well in the data center relative to its constituent policies.
Concurrency and Computation: Practice and Experience | 2013
Kefeng Deng; Kaijun Ren; Junqiang Song; Dong Yuan; Yang Xiang; Jinjun Chen
Due to its cost-effectiveness, on-demand provisioning, and ease of sharing, cloud computing has grown in popularity with the research community for deploying scientific applications such as workflows. As this interest continues to grow and scientific workflows are widely deployed in collaborative cloud environments consisting of multiple data centers, there is an urgent need for strategies that place application datasets across globally distributed data centers and schedule tasks according to the data layout, reducing both latency and makespan for workflow execution. In this paper, by utilizing the dependencies among datasets and tasks, we propose an efficient data and task co-scheduling strategy that places input datasets in a load-balanced way while grouping the most closely related datasets and tasks together. Moreover, data staging is used to overlap task execution with data transmission in order to shorten the start time of tasks. We build a simulation environment on the Tianhe supercomputer to evaluate the proposed strategy and run simulations with random and realistic workflows. The results demonstrate that the proposed strategy can effectively improve scheduling performance while reducing the total volume of data transfer across data centers.
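The co-scheduling idea above can be pictured as a greedy placement that keeps data centers load-balanced while putting datasets that are consumed together in the same place. The sketch below is a heavily simplified illustration of that grouping heuristic, not the authors' algorithm; the dataset sizes, task-to-dataset dependencies, and per-data-center capacities are all assumed.

```python
from collections import defaultdict

# Assumed toy input: dataset sizes (GB) and the datasets each task consumes.
DATASETS = {"d1": 40, "d2": 10, "d3": 25, "d4": 30, "d5": 15}
TASKS = {"t1": ["d1", "d2"], "t2": ["d2", "d3"],
         "t3": ["d4", "d5"], "t4": ["d1", "d3"]}
CAPACITY = {"dc1": 70, "dc2": 70}   # per-data-center storage budget (GB)

def affinity(dataset, datasets_in_dc):
    """Number of tasks that use `dataset` together with a dataset
    already placed in the candidate data center."""
    return sum(1 for inputs in TASKS.values()
               if dataset in inputs and any(d in datasets_in_dc for d in inputs))

def place_datasets():
    placement, used = {}, defaultdict(int)
    # Place the largest datasets first so capacity is easier to respect.
    for ds in sorted(DATASETS, key=DATASETS.get, reverse=True):
        feasible = [dc for dc in CAPACITY if used[dc] + DATASETS[ds] <= CAPACITY[dc]]
        # Prefer the data center with the highest co-usage affinity,
        # breaking ties by current load to keep the placement balanced.
        dc = max(feasible,
                 key=lambda c: (affinity(ds, [d for d, p in placement.items() if p == c]),
                                -used[c]))
        placement[ds] = dc
        used[dc] += DATASETS[ds]
    return placement

if __name__ == "__main__":
    for ds, dc in place_datasets().items():
        print(ds, "->", dc)
```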
Grid Computing | 2011
Kefeng Deng; Junqiang Song; Kaijun Ren; Dong Yuan; Jinjun Chen
Recently, cloud computing has emerged as a promising computing infrastructure for executing scientific workflows by providing on-demand resources. It is also convenient for scientific collaboration, since the different cloud environments used by researchers are connected through the Internet. However, the significant latency arising from frequent access to large datasets, and the corresponding data movements across geo-distributed data centers, has been an obstacle to the efficient execution of data-intensive scientific workflows. In this paper, we propose a novel graph-cut based data and task co-scheduling strategy for minimizing data transfer across geo-distributed data centers. Specifically, a dependency graph is first constructed from workflow provenance and cut into subgraphs, according to the datasets that must reside in fixed data centers, by a multiway cut algorithm. The subgraphs may then be recursively cut into smaller ones by a minimum cut algorithm, guided by data correlation rules, until all of them fit the capacity constraints of the data centers where the fixed-location datasets reside. In this way, the datasets and tasks are distributed to the target data centers while the total amount of data transfer between them is minimized. Additionally, a runtime scheduling algorithm dynamically adjusts the data placement during execution to prevent the data centers from overloading. Simulation results demonstrate that the total volume of data transfer across data centers can be significantly reduced, and the cost of running scientific workflows on clouds is accordingly reduced.
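The recursive cutting described above can be prototyped with an off-the-shelf graph library. The sketch below builds a small dependency graph whose edge weights represent the data volume moved between endpoints and recursively bisects any part that exceeds a capacity limit; it uses networkx's Kernighan-Lin bisection as a convenient stand-in for the multiway cut and minimum cut algorithms of the paper, and all node sizes and edge weights are invented for illustration.

```python
import networkx as nx
from networkx.algorithms.community import kernighan_lin_bisection

# Toy dependency graph: nodes are datasets/tasks with a storage "size" (GB);
# edge weights are the data volume transferred if the endpoints end up in
# different data centers. All numbers are illustrative.
G = nx.Graph()
sizes = {"a": 20, "b": 15, "c": 30, "d": 10, "e": 25, "f": 20}
G.add_nodes_from(sizes)
G.add_weighted_edges_from([("a", "b", 8), ("b", "c", 12), ("c", "d", 3),
                           ("d", "e", 9), ("e", "f", 7), ("a", "f", 2)])

CAPACITY = 60   # maximum total size a single data center can hold (assumed)

def recursive_partition(nodes):
    """Recursively bisect an over-capacity node set with a weighted
    Kernighan-Lin cut (a stand-in for the paper's min-cut step)."""
    total = sum(sizes[n] for n in nodes)
    if total <= CAPACITY or len(nodes) < 2:
        return [set(nodes)]
    left, right = kernighan_lin_bisection(G.subgraph(nodes), weight="weight", seed=0)
    return recursive_partition(left) + recursive_partition(right)

if __name__ == "__main__":
    for i, part in enumerate(recursive_partition(set(G.nodes)), start=1):
        print(f"data center {i}: {sorted(part)} "
              f"(size {sum(sizes[n] for n in part)} GB)")
```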
IEEE International Conference on Dependable, Autonomic and Secure Computing | 2011
Kefeng Deng; Lingmei Kong; Junqiang Song; Kaijun Ren; Dong Yuan
Due to its cost-effectiveness, on-demand resource provisioning, and ease of sharing, cloud computing has grown in popularity with the research community for deploying scientific applications such as workflows. As this interest continues to grow and workflows are widely executed in collaborative cloud environments consisting of multiple data centers, there is an urgent need for strategies that place application data across globally distributed data centers and schedule tasks according to the data layout, reducing both latency and makespan for workflow execution. In this paper, by utilizing the dependencies among datasets and tasks, we propose an efficient data and task co-scheduling strategy that places input datasets in a load-balanced way while grouping the most closely related datasets and tasks together. We build a simulation environment on the Tianhe supercomputer to evaluate the proposed strategy and run simulations with random and realistic workflows. The results demonstrate that the proposed strategy can effectively improve workflow performance while reducing the total volume of data transfer across data centers.
International Conference on Cloud and Green Computing | 2012
Kefeng Deng; Kaijun Ren; Junqiang Song
Virtualization enables server consolidation, which maximizes resource utilization by running multiple virtual machines simultaneously on the same physical platform. One of the challenges of server consolidation is resource contention among the virtual machines, a problem that is further aggravated on modern simultaneous multithreading (SMT) processors. Traditional symbiotic scheduling algorithms co-schedule threads with complementary resource requirements to reduce resource contention and boost system performance. However, this technique cannot be directly applied in virtualized environments, since the applications are encapsulated in virtual machines. This paper proposes a symbiotic scheduling approach to improve the performance of concurrent workloads running in virtual machines on SMT processors. Our approach samples the resource demands of threads in virtual machines and passes the sampling data, together with the thread-to-VCPU mapping information, to the privileged domain in order to compute VCPU symbiosis. A VCPU scheduling algorithm is devised that dynamically pins VCPUs according to their symbiosis. For load balancing, a portion of the less affinitive VCPUs is left unpinned to provide flexibility for VCPU migration. We have implemented a prototype on Xen, and experimental results show that the proposed approach can improve the performance of concurrent workloads by up to 24% compared with the unmodified Credit scheduler.
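The symbiotic step in this approach amounts to pairing VCPUs whose sampled resource demands are complementary before pinning each pair to the sibling hardware threads of an SMT core. The sketch below shows one simple way to form such pairs from sampled demand values; the demand metric, the symbiosis score, and the greedy pairing are illustrative assumptions, not the Xen-level mechanism from the paper.

```python
# Sampled per-VCPU pressure on shared SMT resources (e.g., cache misses
# per kilo-instruction); the values are invented for illustration.
DEMAND = {"vcpu0": 0.9, "vcpu1": 0.2, "vcpu2": 0.7,
          "vcpu3": 0.1, "vcpu4": 0.5, "vcpu5": 0.3}

def symbiosis(a, b):
    """Higher score = more complementary demands = less expected
    contention when the two VCPUs share an SMT core."""
    return 1.0 - min(DEMAND[a], DEMAND[b])

def pair_vcpus(vcpus):
    """Greedily pair the most resource-hungry VCPU with the least hungry
    remaining one, so each SMT core receives complementary load."""
    ordered = sorted(vcpus, key=DEMAND.get)
    pairs = []
    while len(ordered) >= 2:
        light, heavy = ordered.pop(0), ordered.pop(-1)
        pairs.append((heavy, light, symbiosis(heavy, light)))
    return pairs

if __name__ == "__main__":
    for core, (heavy, light, score) in enumerate(pair_vcpus(list(DEMAND))):
        print(f"core {core}: pin {heavy} + {light} (symbiosis {score:.2f})")
```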
IEEE Transactions on Cloud Computing | 2015
Kefeng Deng; Kaijun Ren; Ming Zhu; Junqiang Song
Cloud computing has emerged as a promising computational infrastructure for cost-efficient workflow execution by provisioning on-demand resources in a pay-as-you-go manner. Because scientific workflows require access to community-wide resources, they usually need to be executed in collaborative cloud environments composed of multiple datacenters. Although such environments facilitate scientific collaboration, the movement of input and intermediate datasets across geographically distributed datacenters may cause intolerable latency that hinders the efficient execution of large-scale data-intensive scientific workflows. To address this problem, in this article we propose a novel multi-level K-cut graph partitioning algorithm that minimizes the volume of data transfer across datacenters while satisfying load balancing and fixed-data constraints. The algorithm first contracts the fixed input datasets in the same datacenter together with their consuming tasks, and coarsens the contracted graph to a predefined scale in a level-by-level manner. Then, a K-cut algorithm partitions the resulting graph into K parts such that the cut size is minimized. After that, the partitioned graph is projected back to the original workflow graph, during which the load balancing constraint is maintained. We evaluate our algorithm using three real-world workflow applications, and the results demonstrate that it outperforms other state-of-the-art algorithms.
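The coarsening phase of such a multi-level scheme can be illustrated with a single level of heavy-edge matching: repeatedly contract the heaviest remaining edge whose endpoints are both unmatched, so the graph shrinks while the largest inter-node data volumes are removed from any future cut. The sketch below shows only this one step on an assumed toy graph, and omits the K-cut and projection phases entirely.

```python
# One level of heavy-edge-matching coarsening over a toy workflow graph.
# Nodes are datasets/tasks; edge weights are data volumes (GB), all assumed.
EDGES = {("a", "b"): 12, ("b", "c"): 4, ("c", "d"): 9,
         ("d", "e"): 15, ("e", "a"): 2, ("b", "e"): 6}

def coarsen_once(edges):
    """Match each node with at most one neighbour, preferring the heaviest
    incident edge, then merge matched pairs into super-nodes and sum the
    weights of edges that become parallel."""
    matched, merge_into = set(), {}
    for (u, v), _ in sorted(edges.items(), key=lambda e: e[1], reverse=True):
        if u not in matched and v not in matched:
            matched.update((u, v))
            merge_into[v] = u          # contract v into u
    def rep(n):                        # representative super-node of n
        return merge_into.get(n, n)
    coarse = {}
    for (u, v), w in edges.items():
        a, b = rep(u), rep(v)
        if a != b:                     # drop self-loops created by contraction
            key = tuple(sorted((a, b)))
            coarse[key] = coarse.get(key, 0) + w
    return coarse

if __name__ == "__main__":
    print("coarse graph:", coarsen_once(EDGES))
```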
International Conference on Information Science and Technology | 2016
Shaowei Liu; Kaijun Ren; Kefeng Deng; Junqiang Song
The characteristics of cloud computing, such as on-demand provisioning of virtual machines in a pay-as-you-go manner, have attracted more and more scientific workflows to cloud platforms. Since there are many types of virtual machines, each charged by time interval, the difficulty of resource provisioning hinders the efficient execution of scientific workflows on cloud platforms. To address this challenge, a novel task-backfill based scientific workflow scheduling strategy is proposed in this paper. The strategy uses a task backfill algorithm to aggregate multiple tasks on a virtual machine instance with suitable performance and to fill the idle time slots of virtual machines with single tasks, improving resource utilization without affecting overall performance. Experimental results demonstrate that, in comparison with the widely used HEFT and IC-PCPD2 strategies, the proposed strategy can effectively reduce the execution cost of scientific workflows and improve resource utilization while satisfying the deadline constraint.
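The backfill step can be sketched as follows: when a ready task arrives, first try to fit it into an idle gap on an already-leased virtual machine before paying for a new one. The code below is a simplified, list-based illustration of that idea; the gap representation, task runtime, and deadline check are assumptions rather than the paper's scheduler.

```python
# Each leased VM keeps a list of idle gaps as (start, end) pairs; the
# numbers below (hours within the billing horizon) are invented.
vm_gaps = {"vm1": [(2.0, 3.5), (6.0, 8.0)], "vm2": [(0.0, 1.0)]}

def backfill(task_runtime, deadline, gaps_by_vm):
    """Place the task in the earliest idle gap that is long enough and
    finishes before the deadline; return None if no gap fits, in which
    case the caller would lease a new VM instance instead."""
    candidates = []
    for vm, gaps in gaps_by_vm.items():
        for i, (start, end) in enumerate(gaps):
            if end - start >= task_runtime and start + task_runtime <= deadline:
                candidates.append((start, vm, i))
    if not candidates:
        return None
    start, vm, i = min(candidates)
    gap_start, gap_end = gaps_by_vm[vm].pop(i)
    # Keep the unused tail of the gap available for later tasks.
    if gap_start + task_runtime < gap_end:
        gaps_by_vm[vm].insert(i, (gap_start + task_runtime, gap_end))
    return vm, gap_start

if __name__ == "__main__":
    print(backfill(task_runtime=1.0, deadline=5.0, gaps_by_vm=vm_gaps))
    print(vm_gaps)
```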
International Symposium on Computer, Consumer and Control | 2016
Shaowei Liu; Kaijun Ren; Kefeng Deng; Junqiang Song
IaaS cloud platforms, which offer many types of virtual machines charged by time interval in a pay-as-you-go manner, have attracted more and more scientists to deploy their scientific workflows on them. Executing multiple workflows on an IaaS cloud platform raises the problem that different workflows have different arrival times and priorities, and the budget and deadline constraints may force some workflows to be discarded. To address this challenge, a time-dependence based scheduling strategy for multiple workflows is proposed in this paper. The strategy analyzes the specific structure of each workflow, evaluates its priority according to its indicators and its relationships with other workflows, and discards some low-priority workflows. Experimental results demonstrate that the proposed strategy can minimize the proportion of discarded workflows and improve the workflow completion percentage and resource utilization while satisfying the budget and deadline constraints.
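The admission logic described above can be pictured as a priority queue over incoming workflows: each workflow gets a score from simple indicators (for example deadline slack and estimated cost), and workflows are admitted in score order until the budget is exhausted. The indicator set, weighting, and budget model below are assumptions used only to make the idea concrete, not the strategy evaluated in the paper.

```python
from dataclasses import dataclass

@dataclass
class Workflow:
    name: str
    arrival: float       # hours since start of the planning horizon
    deadline: float      # absolute deadline (hours)
    est_runtime: float   # estimated critical-path runtime (hours)
    est_cost: float      # estimated execution cost ($)

def priority(w: Workflow) -> float:
    """Higher priority for workflows with little deadline slack and low
    estimated cost; the weighting is an illustrative choice."""
    slack = max(w.deadline - w.arrival - w.est_runtime, 0.0)
    return 1.0 / (1.0 + slack) + 1.0 / (1.0 + w.est_cost)

def admit(workflows, budget):
    """Admit workflows in priority order until the budget runs out;
    the rest are discarded (or deferred)."""
    admitted, discarded, spent = [], [], 0.0
    for w in sorted(workflows, key=priority, reverse=True):
        if spent + w.est_cost <= budget and w.arrival + w.est_runtime <= w.deadline:
            admitted.append(w.name)
            spent += w.est_cost
        else:
            discarded.append(w.name)
    return admitted, discarded

if __name__ == "__main__":
    wfs = [Workflow("montage", 0.0, 6.0, 4.0, 20.0),
           Workflow("cybershake", 1.0, 12.0, 5.0, 35.0),
           Workflow("epigenomics", 2.0, 5.0, 2.5, 15.0)]
    print(admit(wfs, budget=40.0))
```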