Huanle Xu

The Chinese University of Hong Kong

Publication

Featured research published by Huanle Xu.

international symposium on information theory | 2014

Regenerating Codes over a Binary Cyclic Code

Kenneth W. Shum; Hanxu Hou; Minghua Chen; Huanle Xu; Hui Li

We present a design framework of regenerating codes for distributed storage systems which employs binary additions and bit-wise cyclic shifts as the basic operations. The proposed coding method can be regarded as a concatenated coding scheme, with the outer code being a binary cyclic code and the inner code a regenerating code that uses the binary cyclic code as its alphabet. The advantage of this approach is that encoding and the repair of a failed node can be done with low computational complexity. It is proved that the proposed coding method asymptotically achieves the fundamental tradeoff curve between storage and repair bandwidth when the size of the data file is large.

international conference on computer communications | 2015

Optimization for speculative execution in a MapReduce-like cluster

Huanle Xu; Wing Cheong Lau

A parallel processing job can be delayed substantially if even one of its many tasks is assigned to an unreliable machine. To tackle this so-called straggler problem, most parallel processing frameworks such as MapReduce have adopted various strategies under which the system may speculatively launch additional copies of the same task if its progress is abnormally slow or simply because extra idle resources are available. In this paper, we focus on the design of speculative execution schemes for a parallel processing cluster under different loading conditions. For the lightly loaded case, we analyze and propose two optimization-based schemes, namely, the Smart Cloning Algorithm (SCA) which is based on maximizing the job utility. We also derive the workload threshold under which SCA should be used for speculative execution. Our simulation results show that SCA can reduce the total job flowtime by nearly 22% compared to the speculative execution strategy of Microsoft Mantri. For the heavily loaded case, we propose the Enhanced Speculative Execution (ESE) algorithm, which is an extension of the Microsoft Mantri scheme. We show that the ESE algorithm can beat the Mantri baseline scheme by 35% in terms of job flowtime while consuming the same amount of resources.

international conference on cloud computing | 2014

Speculative Execution for a Single Job in a MapReduce-Like System

Huanle Xu; Wing Cheong Lau

Parallel processing plays an important role in large-scale data analytics. It breaks a job into many small tasks which run in parallel on multiple machines, as in the MapReduce framework. One fundamental challenge faced by such parallel processing is straggling tasks, which can seriously delay the completion of a job. In this paper, we focus on speculative execution, which is used in the literature to deal with the straggler problem. We present a theoretical framework for the optimization of a single job, which differs substantially from previous heuristics-based work. More precisely, we propose two schemes for the case where the number of parallel tasks in the job is smaller than the cluster size. In the first scheme, no monitoring is needed, and we can guarantee the job deadline with high probability while achieving the optimal resource consumption level. The second scheme monitors task progress and launches the optimal number of duplicates when a task straggles. On the other hand, when the number of tasks in a job is larger than the cluster size, we propose an Enhanced Speculative Execution (ESE) algorithm to make the optimal decision whenever a machine becomes available for scheduling. The simulation results show that the ESE algorithm can reduce the job flow time by 50% while consuming fewer resources compared to the strategy without backup.
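
As a rough illustration of why launching copies up front can give a deadline guarantee without monitoring, the sketch below finds the smallest number of copies per task that meets a target probability. It is not the scheme from the paper: it assumes every copy misses the deadline independently with the same probability, and the names `min_copies`, `miss_prob`, `num_tasks` and `target` are hypothetical.

```python
def min_copies(miss_prob, num_tasks, target):
    """Smallest number of copies per task so that the whole job meets its
    deadline with probability at least `target`.

    Assumption (not from the paper): each copy misses the deadline
    independently with probability `miss_prob`, and a task finishes on time
    as soon as any one of its copies does.
    """
    if not (0.0 < miss_prob < 1.0 and 0.0 < target < 1.0):
        raise ValueError("miss_prob and target must lie strictly between 0 and 1")
    copies = 1
    # A task misses only if all of its copies miss: miss_prob ** copies.
    # The job is on time only if every one of its tasks is on time.
    while (1.0 - miss_prob ** copies) ** num_tasks < target:
        copies += 1
    return copies


# Hypothetical numbers: 100 tasks, each copy late 10% of the time.
copies_needed = min_copies(miss_prob=0.1, num_tasks=100, target=0.99)
```

Under these assumptions the required number of copies grows only logarithmically with the number of tasks, which is why cloning can be affordable when the cluster has spare capacity.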

international conference on network protocols | 2013

Resource optimization for speculative execution in a MapReduce Cluster

Huanle Xu; Wing Cheong Lau

The MapReduce paradigm is now the de facto standard for large-scale data analytics. In this paper, we address resource management issues in a MapReduce cluster, where speculative execution (task backup) plays an important role. We propose two different strategies and build two models to formulate the backup issue as an optimization problem when the cluster is lightly loaded. Moreover, we present an Enhanced Speculative Execution (ESE) algorithm for the case where the cluster is heavily loaded and adopt an approximate analysis to obtain an optimal value for the parameter in the algorithm. The simulation results show that the algorithm can reduce the job completion time by 50% while consuming far fewer resources than the naive method without backup.

international conference on communications | 2016

Accelerating graph mining algorithms via uniform random edge sampling

Ruohan Gao; Huanle Xu; Pili Hu; Wing Cheong Lau

The seminal works by Karger [13], [14] have shown that one can use Uniform Random Edge (URE) sampling to generate a graph skeleton which accurately approximates all cut-values in the original graph with high probability under some specific assumptions. As such, the random subgraphs resulting from URE sampling can often be used as substitutes for the original graphs in cut/flow-related graph-optimization problems [14]. In this paper, we extend the results of Karger to show that, besides the value (weight) of the cut-set, the weights of four additional types of edge-set, namely, Volume, Association, Complement Volume and Complement Association, are all well-preserved under URE sampling. More importantly, we show that these well-preserved edge-set metrics have a dominant impact on the outcome of common graph-mining tasks including PageRank computation and Community Detection. As a result, URE sampling can be used to accelerate the corresponding graph-mining algorithms with small approximation errors. Via extensive experiments with large-scale real-world graphs, we demonstrate that URE sampling can achieve over 90% accuracy for PageRank computation and Modularity-based Community Detection by sampling only 20% of the edges of the original graph.
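
The basic idea of URE sampling can be sketched in a few lines: keep each edge independently with probability p, then rescale a cut value measured on the skeleton by 1/p to estimate the cut value in the original graph. The toy graph, the seed, and the helper names `ure_sample` and `cut_value` below are illustrative assumptions, not the experimental setup of the paper.

```python
import random


def ure_sample(edges, p, seed=0):
    """Keep each edge independently with probability p (uniform random edge sampling)."""
    rng = random.Random(seed)
    return [e for e in edges if rng.random() < p]


def cut_value(edges, side):
    """Count edges with exactly one endpoint in the node set `side` (unit weights)."""
    return sum((u in side) != (v in side) for u, v in edges)


# Hypothetical toy input: the complete graph on 200 nodes, cut into two halves.
n = 200
edges = [(u, v) for u in range(n) for v in range(u + 1, n)]
side = set(range(n // 2))

p = 0.2  # keep roughly 20% of the edges
skeleton = ure_sample(edges, p)

true_cut = cut_value(edges, side)
# Each cut edge survives with probability p, so dividing by p gives an
# unbiased estimate of the original cut value.
estimated_cut = cut_value(skeleton, side) / p
```

On a dense graph like this one the estimate concentrates tightly around the true value; on sparse graphs or very small cuts the relative error of such an estimate would be larger.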

international conference on computer communications | 2017

Addressing job processing variability through redundant execution and opportunistic checkpointing: A competitive analysis

Huanle Xu; Gustavo de Veciana; Wing Cheong Lau

The completion times of jobs in a computing cluster may be influenced by a variety of factors including job size and machine processing variability. In this paper, we explore online resource allocation policies which combine size-dependent scheduling with redundant execution and opportunistic checkpointing to minimize the overall job flowtime. We introduce a simplified model for the job service capacity of a computing cluster while leveraging redundant execution/checkpointing. In this setting, we propose two resource allocation algorithms, SRPT+R and LAPS+R(β) subject to checkpointing overhead not exceeding the number of jobs which are processed. We provide new theoretical performance bounds for these algorithms: SRPT+R is shown to be O(1/∊) competitive under (1 + ∊)-speed resource augmentation, while LAPS+R(β) is shown to be O(1/β∊) competitive under (2+ 2β + 2∊)-speed resource augmentation.

international conference on computer communications | 2015

DPCP: A protocol for optimal pull coordination in decentralized social networks

Huanle Xu; Pili Hu; Wing Cheong Lau; Qiming Zhang; Yang Wu

Social networking services have become an essential part of daily life. However, many privacy concerns have recently been raised due to the centralized nature of such services. A Decentralized Social Network (DSN) is believed to be a viable solution to these problems. In this paper, we design a protocol to coordinate the pulling operation of DSN nodes. The protocol is the result of forward engineering via utility maximization that takes the communication-layer congestion level as well as social-network-layer centrality into consideration. We solve the pulling rate control problem using the primal-dual approach and prove that the protocol converges quickly when executed in a decentralized manner. Furthermore, we develop a novel “drumbeats” algorithm to estimate node centrality based purely on passively observed information. Simulation results show that our protocol reduces the average message propagation delay by 15% compared to the baseline Fixed Equal Gap Pull protocol. In addition, the estimated node centrality matches well with the ground truth derived from the actual topology of the social network.

international conference on algorithms and architectures for parallel processing | 2015

Solving Large Graph Problems in MapReduce-Like Frameworks via Optimized Parameter Configuration

Huanle Xu; Ronghai Yang; Zhibo Yang; Wing Cheong Lau

In this paper, we propose a scheme to solve large dense graph problems under the MapReduce framework. The graph data is organized into blocks, and the blocks are assigned to different map workers for parallel processing. Intermediate results of the map workers are combined by one reduce worker for the next round of processing. This procedure is iterative, and the graph size can be reduced substantially after each round. In the last round, a small graph is processed on a single map worker to produce the final result. Specifically, we present some basic algorithms, such as Minimum Spanning Tree, Finding Connected Components and Single-Source Shortest Path, which can be implemented efficiently using this scheme. We also offer a mathematical formulation to determine the parameters of our scheme so as to achieve the optimal running-time performance. Note that the proposed scheme can be applied on MapReduce-like platforms such as Spark. We use our own cluster and Amazon EC2 as testbeds to evaluate the performance of the proposed Minimum Spanning Tree algorithm under the MapReduce and Spark frameworks, respectively. The experimental results match well with our theoretical analysis. Using this approach, many parallelizable problems can be solved efficiently in MapReduce-like frameworks.
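
The block-then-merge idea can be sketched for Minimum Spanning Tree in a single process. This is a simplified, one-round stand-in that assumes edges are split into blocks arbitrarily, not the paper's iterative scheme or its parameter optimization; `kruskal` and `blocked_mst` are hypothetical helper names. It relies on the fact that an edge discarded by a block's local spanning forest is the heaviest edge on some cycle and so can be dropped without changing the final tree weight.

```python
def kruskal(num_nodes, edges):
    """Minimum spanning forest of (weight, u, v) edges via Kruskal with union-find."""
    parent = list(range(num_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    forest = []
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            forest.append((w, u, v))
    return forest


def blocked_mst(num_nodes, edges, num_blocks):
    # "Map" step: each block keeps only its local minimum spanning forest,
    # which has at most num_nodes - 1 edges regardless of block size.
    blocks = [edges[i::num_blocks] for i in range(num_blocks)]
    local_forests = [kruskal(num_nodes, block) for block in blocks]
    # "Reduce" step: merge the surviving edges and finish on one worker.
    merged = [edge for forest in local_forests for edge in forest]
    return kruskal(num_nodes, merged)


# Hypothetical dense input: complete graph on 30 nodes with made-up weights.
n = 30
edges = [((7 * u + 3 * v) % 11 + 1, u, v) for u in range(n) for v in range(u + 1, n)]

tree = blocked_mst(n, edges, num_blocks=4)
direct = kruskal(n, edges)
```

For a dense graph the map step shrinks each block from many edges to fewer than `num_nodes`, which is the kind of per-round size reduction the abstract describes.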

international conference on distributed computing systems | 2015