Publication


Featured research published by Shicong Meng.


high performance distributed computing | 2014

MRONLINE: MapReduce online performance tuning

Min Li; Liangzhao Zeng; Shicong Meng; Jian Tan; Li Zhang; Ali Raza Butt; Nicholas C. M. Fuller

MapReduce job parameter tuning is a daunting and time-consuming task. The parameter configuration space is huge; there are more than 70 parameters that impact job performance. It is also difficult for users to determine suitable values for the parameters without first having a good understanding of the MapReduce application characteristics. Thus, it is a challenge to systematically explore the parameter space and select a near-optimal configuration. Extant offline tuning approaches are slow and inefficient as they entail multiple test runs and significant human effort. To this end, we propose an online performance tuning system, MRONLINE, that monitors a job's execution, tunes associated performance-tuning parameters based on collected statistics, and provides fine-grained control over parameter configuration. MRONLINE allows each task to have a different configuration, instead of having to use the same configuration for all tasks. Moreover, we design a gray-box based smart hill climbing algorithm that can efficiently converge to a near-optimal configuration with high probability. To improve the search quality and increase convergence speed, we also incorporate a set of MapReduce-specific tuning rules in MRONLINE. Our results using a real implementation on a representative 19-node cluster show that dynamic performance tuning can effectively improve MapReduce application performance by up to 30% compared to the default configuration used in YARN.
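The hill-climbing idea behind such a tuner can be sketched as a coordinate search over a discrete parameter space. The parameter names, candidate values, and cost function below are illustrative stand-ins, not MRONLINE's actual configuration space or algorithm; a real tuner would measure live task statistics instead of evaluating a synthetic cost.

```python
import random

# Hypothetical parameter space: each tunable has a few candidate values.
# These names and ranges are illustrative only.
PARAM_SPACE = {
    "io.sort.mb": [64, 128, 256, 512],
    "io.sort.factor": [10, 50, 100],
    "shuffle.parallelcopies": [5, 10, 20],
}

def job_cost(config):
    """Stand-in for a measured task runtime; lower is better."""
    return (abs(config["io.sort.mb"] - 256)
            + abs(config["io.sort.factor"] - 50)
            + abs(config["shuffle.parallelcopies"] - 10))

def hill_climb(space, cost, seed=0):
    """Greedy coordinate search: keep any single-parameter change that
    lowers the observed cost, until no parameter can be improved."""
    rng = random.Random(seed)
    current = {k: rng.choice(v) for k, v in space.items()}
    improved = True
    while improved:
        improved = False
        for key, candidates in space.items():
            for value in candidates:
                trial = dict(current, **{key: value})
                if cost(trial) < cost(current):
                    current, improved = trial, True
    return current

best = hill_climb(PARAM_SPACE, job_cost)
print(best)
```

Because the toy cost is separable per parameter, this search always reaches the optimum; the paper's gray-box variant additionally uses MapReduce-specific rules to prune and guide the search.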


international conference on computer communications | 2013

Improving ReduceTask data locality for sequential MapReduce jobs

Jian Tan; Shicong Meng; Xiaoqiao Meng; Li Zhang

Improving data locality for MapReduce jobs is critical for the performance of large-scale Hadoop clusters, embodying the principle of moving computation close to data for big data platforms. Scheduling tasks in the vicinity of stored data can significantly diminish network traffic, which is crucial for system stability and efficiency. Though the issue of data locality has been investigated extensively for MapTasks, most of the existing schedulers ignore data locality for ReduceTasks when fetching the intermediate data, causing performance degradation. This problem of reducing the fetching cost for ReduceTasks has been identified recently. However, the proposed solutions are exclusively based on a greedy approach, relying on the intuition of placing ReduceTasks in the slots closest to the majority of the already generated intermediate data. The consequence is that, in the presence of job arrivals and departures, assigning the ReduceTasks of the current job to the nodes with the lowest fetching cost can prevent a subsequent job with an even better data-locality match from being launched on the already taken slots. To this end, we formulate a stochastic optimization framework to improve the data locality for ReduceTasks, with the optimal placement policy exhibiting a threshold-based structure. In order to ease the implementation, we further propose a receding horizon control policy based on the optimal solution under restricted conditions. The improved performance is further validated through simulation experiments and real performance tests on our testbed.
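The contrast between the greedy approach and a threshold-based structure can be shown with a minimal sketch. The slot names, costs, and threshold value below are hypothetical; the paper derives the actual threshold from its stochastic optimization framework.

```python
def place_reduce_task(slot_costs, threshold):
    """Threshold policy (illustrative): take the cheapest free slot only if
    its fetching cost is at or below the threshold; otherwise defer, leaving
    the slot open for a later job with a better data-locality match.
    A purely greedy policy would always take the cheapest slot."""
    best_slot = min(slot_costs, key=slot_costs.get)
    if slot_costs[best_slot] <= threshold:
        return best_slot
    return None  # defer placement

slots = {"node-a": 3.0, "node-b": 7.5, "node-c": 5.2}
print(place_reduce_task(slots, threshold=4.0))  # accepts node-a
print(place_reduce_task(slots, threshold=2.0))  # defers
```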


european conference on computer systems | 2014

DynMR: dynamic MapReduce with ReduceTask interleaving and MapTask backfilling

Jian Tan; Alicia Chin; Zane Zhenhua Hu; Yonggang Hu; Shicong Meng; Xiaoqiao Meng; Li Zhang

In order to improve the performance of MapReduce, we design DynMR. It addresses the following problems that persist in the existing implementations: 1) difficulty in selecting optimal performance parameters for a single job in a fixed, dedicated environment, and lack of capability to configure parameters that can perform optimally in a dynamic, multi-job cluster; 2) long job execution resulting from a task long-tail effect, often caused by ReduceTask data skew or heterogeneous computing nodes; 3) inefficient use of hardware resources, since ReduceTasks bundle several functional phases together and may idle during certain phases. DynMR adaptively interleaves the execution of several partially-completed ReduceTasks and backfills MapTasks so that they run in the same JVM, one at a time. It consists of three components. 1) A running ReduceTask uses a detection algorithm to identify resource underutilization during the shuffle phase. It then yields the allocated hardware resources efficiently to the next task. 2) A number of ReduceTasks are gradually assembled in a progressive queue, according to a flow control algorithm at runtime. These tasks execute in an interleaved rotation. Additional ReduceTasks can be inserted adaptively into the progressive queue if the full fetching capacity is not reached, and MapTasks can be back-filled if capacity is still underused. 3) Merge threads of each ReduceTask are extracted out as standalone services within the associated JVM. This design allows the data segments of multiple partially-complete ReduceTasks to reside in the same JVM heap, controlled by a segment manager and served by the common merge threads. Experiments show 10% to 40% improvements, depending on the workload.
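The progressive-queue rotation with MapTask backfilling can be sketched as a toy scheduler. The task structure, demand units, and admission rule below are hypothetical simplifications, not DynMR's flow control algorithm.

```python
from collections import deque

def interleave(reduce_tasks, map_tasks, capacity):
    """Toy rotation: run each queued ReduceTask for one shuffle step, admit
    more ReduceTasks while aggregate fetch demand is below capacity, and
    backfill a MapTask when even the full queue leaves capacity idle."""
    queue, order = deque(), []
    pending = deque(reduce_tasks)
    while pending or queue:
        # admit ReduceTasks until fetch capacity is saturated
        while pending and sum(t["demand"] for t in queue) < capacity:
            queue.append(pending.popleft())
        # capacity still underused with nothing left to admit: backfill
        if sum(t["demand"] for t in queue) < capacity and map_tasks:
            order.append(map_tasks.pop(0)["name"])
        task = queue.popleft()
        task["steps"] -= 1
        order.append(task["name"])
        if task["steps"] > 0:
            queue.append(task)  # rotate back; shuffle not yet finished
    return order

reduce_tasks = [{"name": "R1", "demand": 2, "steps": 2},
                {"name": "R2", "demand": 2, "steps": 1}]
map_tasks = [{"name": "M1"}]
print(interleave(reduce_tasks, map_tasks, capacity=3))
```

In this run, M1 is backfilled only once R2 finishes and frees fetch capacity, mirroring the idea of reclaiming idle shuffle bandwidth for map work.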


ieee international conference on high performance computing data and analytics | 2015

HydraDB: a resilient RDMA-driven key-value middleware for in-memory cluster computing

Yandong Wang; Li Zhang; Jian Tan; Min Li; Yuqing Gao; Xavier R. Guerin; Xiaoqiao Meng; Shicong Meng

In this paper, we describe our experiences and lessons learned from building a general-purpose in-memory key-value middleware, called HydraDB. HydraDB synthesizes a collection of state-of-the-art techniques, including continuous fault tolerance, Remote Direct Memory Access (RDMA), and multicore awareness, to deliver a high-throughput, low-latency access service in a reliable manner for cluster computing applications. The uniqueness of HydraDB mainly lies in its design commitment to fully exploit the RDMA protocol to comprehensively optimize various aspects of a general-purpose key-value store, including latency-critical operations, read enhancement, and data replication for high-availability service. At the same time, HydraDB strives to efficiently utilize multicore systems to prevent data manipulation on the servers from curbing the potential of RDMA. Many teams in our organization have adopted HydraDB to improve the execution of their cluster computing frameworks, including Hadoop, Spark, Sensemaking analytics, and Call Record Processing. In addition, our performance evaluation with a variety of YCSB workloads also shows that HydraDB can substantially outperform several existing in-memory key-value stores by an order of magnitude. Our detailed performance evaluation further corroborates our design choices.
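RDMA and replication are beyond a short sketch, but the multicore-awareness idea, partitioning keys across shards so concurrent operations on different shards never contend on one lock, can be illustrated. This is a generic sharded store, not HydraDB's implementation; the class and method names are invented for the example.

```python
import threading

class ShardedKVStore:
    """Toy in-memory key-value store: each key hashes to a shard with its
    own lock, so operations on different shards proceed without contention.
    This loosely echoes multicore-aware partitioning; RDMA access paths and
    fault tolerance are out of scope for this sketch."""

    def __init__(self, num_shards=4):
        self.shards = [{} for _ in range(num_shards)]
        self.locks = [threading.Lock() for _ in range(num_shards)]

    def _shard(self, key):
        return hash(key) % len(self.shards)

    def put(self, key, value):
        i = self._shard(key)
        with self.locks[i]:
            self.shards[i][key] = value

    def get(self, key):
        i = self._shard(key)
        with self.locks[i]:
            return self.shards[i].get(key)

store = ShardedKVStore()
store.put("user:42", {"name": "alice"})
print(store.get("user:42"))
```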


allerton conference on communication, control, and computing | 2012

Delay asymptotics for heavy-tailed MapReduce jobs

Jian Tan; Shicong Meng; Xiaoqiao Meng; Li Zhang

A MapReduce job consists of two phases that are processed in a map queue and a reduce queue, respectively. The map queue is characterized by the processor-sharing discipline, and the reduce queue by a multi-server station. A reduce task is composed of two sequential steps: the copy/shuffle step and the reduce function step. A synchronization barrier between the map and reduce phases complicates the process: the copy/shuffle step can overlap with the map phase, but its finish point and the start of the reduce function step have to be strictly after the completion of the map phase of the same job. This dependency can result in an interesting criticality phenomenon for the job delay distribution in MapReduce scheduling. We refine the logarithmic asymptotics that has been established for heavy-tailed MapReduce jobs by studying the exact asymptotics. The analysis reveals that the MapReduce framework combines the features of both processor-sharing and first-in-first-out disciplines.
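The synchronization barrier described above can be captured in a one-line timing model: shuffle overlaps the map phase, but the reduce function step starts only after every MapTask of the job finishes. The sketch below abstracts away the service disciplines (processor sharing, multi-server station) and uses made-up task times.

```python
def job_finish_time(map_times, shuffle_time, reduce_time):
    """Toy barrier model: the reduce function step cannot start before the
    slowest MapTask of the same job completes, even if shuffling finished
    earlier. All times are illustrative, in arbitrary units."""
    map_phase_end = max(map_times)                  # barrier: all maps done
    shuffle_end = max(map_phase_end, shuffle_time)  # shuffle overlaps maps
    return shuffle_end + reduce_time                # reduce function is last

# The slowest of three MapTasks (5.0) dominates the overlapped shuffle (2.0).
print(job_finish_time([3.0, 5.0, 4.0], shuffle_time=2.0, reduce_time=1.5))
```

This "max over map times" dependency is what lets a single heavy-tailed MapTask dominate the whole job's delay, the effect the asymptotic analysis makes precise.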


international conference on autonomic computing | 2013

K-Scope: Online Performance Tracking for Dynamic Cloud Applications

Li Zhang; Xiaoqiao Meng; Shicong Meng; Jian Tan


Archive | 2014

DETECTION OF TIME POINTS TO VOLUNTARILY YIELD RESOURCES FOR CONTEXT SWITCHING

Alicia Elena Chin; Yonggang Hu; Zhenhua Hu; Shicong Meng; Xiaoqiao Meng; Jian Tan; Li Zhang


Archive | 2014

Dynamic Resource Allocation in Mapreduce

Nicholas C. M. Fuller; Min Li; Shicong Meng; Jian Tan; Liangzhao Zeng; Li Zhang


Archive | 2014

INTERLEAVE-SCHEDULING OF CORRELATED TASKS AND BACKFILL-SCHEDULING OF DEPENDER TASKS INTO A SLOT OF DEPENDEE TASKS

Alicia Elena Chin; Michael Feiman; Yonggang Hu; Zhenhua Hu; Shicong Meng; Xiaoqiao Meng; Jian Tan; Li Zhang


Archive | 2014

PASSIVE TWO-PHASE COMMIT SYSTEM FOR HIGH-PERFORMANCE DISTRIBUTED TRANSACTION EXECUTION

Xavier R. Guerin; Shicong Meng
