Gang Lu

Chinese Academy of Sciences

Publication

Featured research published by Gang Lu.

High-Performance Computer Architecture | 2014

BigDataBench: A big data benchmark suite from internet services

Lei Wang; Jianfeng Zhan; Chunjie Luo; Yuqing Zhu; Qiang Yang; Yongqiang He; Wanling Gao; Zhen Jia; Yingjie Shi; Shujie Zhang; Chen Zheng; Gang Lu; Kent Zhan; Xiaona Li; Bizhu Qiu

As architecture, systems, and data management communities pay greater attention to innovative big data systems and architecture, the pressure of benchmarking and evaluating these systems rises. However, the complexity, diversity, frequently changing workloads, and rapid evolution of big data systems raise great challenges in big data benchmarking. Considering the broad use of big data systems, for the sake of fairness, big data benchmarks must include a diversity of data and workloads, which is the prerequisite for evaluating big data systems and architecture. Most state-of-the-art big data benchmarking efforts target specific types of applications or system software stacks, and hence they are not qualified for serving the purposes mentioned above. This paper presents our joint research efforts on this issue with several industrial partners. Our big data benchmark suite, BigDataBench, not only covers broad application scenarios, but also includes diverse and representative data sets. Currently, we choose 19 big data benchmarks from dimensions of application scenarios, operations/algorithms, data types, data sources, software stacks, and application types, and they are comprehensive for fairly measuring and evaluating big data systems and architecture. BigDataBench is publicly available from the project home page http://prof.ict.ac.cn/BigDataBench. Also, we comprehensively characterize the 19 big data workloads included in BigDataBench with varying data inputs. On a typical state-of-practice processor, Intel Xeon E5645, we have the following observations. First, in comparison with traditional benchmarks, including PARSEC, HPCC, and SPECCPU, big data applications have very low operation intensity, which measures the ratio of the total number of instructions to the total number of bytes of memory accesses. Second, the volume of data input has a non-negligible impact on micro-architecture characteristics, which may impose challenges for simulation-based big data architecture research. Last but not least, corroborating the observations in CloudSuite and DCBench (which use smaller data inputs), we find that the numbers of L1 instruction cache (L1I) misses per 1000 instructions (in short, MPKI) of the big data applications are higher than in the traditional benchmarks; we also find that L3 caches are effective for the big data applications, corroborating the observation in DCBench.
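The two ratios named in this abstract, operation intensity and MPKI, can be illustrated with a minimal sketch. The function names and the sample counter values below are hypothetical and chosen only for illustration; they are not taken from BigDataBench or from the paper's measurements.

```python
def operation_intensity(instructions: int, memory_bytes: int) -> float:
    """Instructions executed per byte of memory accesses."""
    if memory_bytes <= 0:
        raise ValueError("memory_bytes must be positive")
    return instructions / memory_bytes


def mpki(misses: int, instructions: int) -> float:
    """Cache misses per 1000 instructions (MPKI)."""
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    return misses * 1000 / instructions


if __name__ == "__main__":
    # Hypothetical counter readings, for illustration only.
    instructions = 2_000_000
    memory_bytes = 4_000_000
    l1i_misses = 30_000
    print(operation_intensity(instructions, memory_bytes))
    print(mpki(l1i_misses, instructions))
```

A lower operation intensity means more memory traffic per instruction, which is the sense in which the abstract calls big data applications low-intensity.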

Frontiers of Computer Science in China | 2012

CloudRank-D: benchmarking and ranking cloud computing systems for data processing applications

Chunjie Luo; Jianfeng Zhan; Zhen Jia; Lei Wang; Gang Lu; Lixin Zhang; Cheng Zhong Xu; Ninghui Sun

With the explosive growth of information, more and more organizations are deploying private cloud systems or renting public cloud systems to process big data. However, there is no existing benchmark suite for evaluating cloud performance at the whole-system level. To the best of our knowledge, this paper proposes the first benchmark suite, CloudRank-D, to benchmark and rank cloud computing systems that are shared for running big data applications. We analyze the limitations of previous metrics, e.g., floating point operations, for evaluating a cloud computing system, and propose two simple, complementary metrics: data processed per second and data processed per joule. We detail the design of CloudRank-D, which considers representative applications, diversity of data characteristics, and dynamic behaviors of both applications and system software platforms. Through experiments, we demonstrate the advantages of our proposed metrics. In several case studies, we evaluate two small-scale deployments of cloud computing systems using CloudRank-D.
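As a rough illustration of the two metrics proposed in this abstract, the sketch below computes data processed per second and data processed per joule from totals for a benchmark run. The function names, units, and sample figures are assumptions made for illustration; the paper's exact measurement procedure is not described here.

```python
def data_processed_per_second(bytes_processed: float, seconds: float) -> float:
    """Throughput metric: bytes of input data processed per second of run time."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return bytes_processed / seconds


def data_processed_per_joule(bytes_processed: float, joules: float) -> float:
    """Energy-efficiency metric: bytes of input data processed per joule consumed."""
    if joules <= 0:
        raise ValueError("joules must be positive")
    return bytes_processed / joules


if __name__ == "__main__":
    # Hypothetical run: 1200 bytes processed in 600 s using 300 J.
    print(data_processed_per_second(1200, 600))
    print(data_processed_per_joule(1200, 300))
```

The two numbers are complementary in the sense that one system can lead on throughput while another leads on energy efficiency for the same workload.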

Symposium on Reliable Distributed Systems | 2012

LogMaster: Mining Event Correlations in Logs of Large-Scale Cluster Systems

Xiaoyu Fu; Rui Ren; Jianfeng Zhan; Wei Zhou; Zhen Jia; Gang Lu

This paper presents a set of innovative algorithms and a system, named LogMaster, for mining correlations of events that have multiple attributes, i.e., node ID, application ID, event type, and event severity, in logs of large-scale cloud and HPC systems. Unlike traditional transactional data, e.g., supermarket purchases, system logs have their own unique characteristics, and hence we propose several innovative approaches to mining their correlations. We parse logs into an n-ary sequence where each event is identified by an informative nine-tuple. We propose a set of enhanced Apriori-like algorithms for improving sequence mining efficiency, an innovative abstraction, event correlation graphs (ECGs), to represent event correlations, and an ECG-based algorithm for fast event prediction. The experimental results on three logs of production cloud and HPC systems, varying from 433490 entries to 4747963 entries, show that our method can predict failures with high precision and an acceptable recall rate.

IEEE Transactions on Parallel and Distributed Systems | 2012

Precise, Scalable, and Online Request Tracing for Multitier Services of Black Boxes

Bo Sang; Jianfeng Zhan; Gang Lu; Haining Wang; Dongyan Xu; Lei Wang; Zhihong Zhang; Zhen Jia

As more and more multitier services are developed from commercial off-the-shelf components or heterogeneous middleware without source code available, both developers and administrators need a request tracing tool to (1) know exactly how a user request of interest travels through services of black boxes and (2) obtain macrolevel user request behaviors of services without manually analyzing massive logs. This need is further exacerbated by IT system “agility,” which requires the tracing tool to provide online performance data, since offline approaches cannot reflect system changes in real time. Moreover, considering the large scale of deployed services, a pragmatic tracing approach should be scalable in terms of the cost of collecting and analyzing logs. In this paper, we introduce a precise, scalable, and online request tracing tool for multitier services of black boxes. Our contributions are threefold. First, we propose a precise request tracing algorithm for multitier services of black boxes, which only uses application-independent knowledge. Second, we present a microlevel abstraction, component activity graph, to represent causal paths of each request. On the basis of this abstraction, we use dominated causal path patterns to represent repeatedly executed causal paths that account for significant fractions, and we further present a derived performance metric of causal path patterns, latency percentages of components, to enable debugging performance-in-the-large. Third, we develop two mechanisms, tracing on demand and sampling, to significantly increase the system scalability. We implement a prototype of the proposed system, called PreciseTracer, and release it as open source code. In comparison with WAP5, a black-box tracing approach, PreciseTracer achieves higher tracing accuracy and faster response time. Our experimental results also show that PreciseTracer has low overhead, and still achieves high tracing accuracy even if an aggressive sampling policy is adopted, indicating that PreciseTracer is a promising tracing tool for large-scale production systems.

IEEE International Symposium on Workload Characterization | 2011

Characterization of real workloads of web search engines

Huafeng Xi; Jianfeng Zhan; Zhen Jia; Xuehai Hong; Lei Wang; Lixin Zhang; Ninghui Sun; Gang Lu

Search is the most heavily used web application in the world and is still growing at an extraordinary rate. Understanding the behaviors of web search engines, therefore, is becoming increasingly important to the design and deployment of data center systems hosting search engines. In this paper, we study three search query traces collected from real-world web search engines at three different search service providers. The first part of our study uncovers the patterns hidden in the query traces by analyzing the variations, frequencies, and locality of query requests. Our analysis reveals that, contrary to some previous studies, real-world query traces do not follow well-defined probability models, such as the Poisson distribution and the log-normal distribution. The second part of our study deploys the real query traces and three synthetic traces, generated using probability models proposed by other researchers, on a Nutch-based search engine. The measured performance data from the deployments further confirm that synthetic traces do not accurately reflect the real traces. We develop an evaluation tool that can collect performance metrics online with negligible overhead. The performance metrics include average response time, CPU utilization, disk accesses, and cycles per instruction. The third part of our study compares the search engine with representative benchmarks, namely Gridmix, SPECweb2005, TPC-C, SPECCPU2006, and HPCC, with respect to basic architecture-level characteristics and performance metrics, such as instruction mix, processor pipeline stall breakdown, memory access latency, and disk accesses. The experimental results show that web search engines have a high percentage of load/store instructions, but have good cache/memory performance. We hope the results presented in this paper will enable system designers to gain insights into optimizing systems hosting search engines.

International Conference on Autonomic Computing | 2012

PowerTracer: tracing requests in multi-tier services to diagnose energy inefficiency

Gang Lu; Jianfeng Zhan; Haining Wang; Lin Yuan; Chuliang Weng

As energy has become one of the key operating costs in running a data center and power waste commonly exists, it is essential to observe and reduce energy inefficiency inside data centers. In this paper, we develop an innovative framework, called PowerTracer, for diagnosing energy-inefficiency. Inside the framework, we first present a resource tracing method based on request tracing in multi-tier services of black boxes. Then, we propose a generalized methodology of applying a request tracing approach for energy-inefficiency diagnosis in multi-tier service systems. With insights into service performance and resource consumption of individual requests, we develop a bottleneck diagnosis tool that pinpoints the root causes of energy inefficiency. We implement the prototype and conduct experiments to validate its effectiveness.

arXiv: Distributed, Parallel, and Cluster Computing | 2015

BigDataBench-MT: A Benchmark Tool for Generating Realistic Mixed Data Center Workloads

Rui Han; Shulin Zhan; Chenrong Shao; Junwei Wang; Lizy Kurian John; Jiangtao Xu; Gang Lu; Lei Wang

Long-running service workloads (e.g., web search engines) and short-term data analysis workloads (e.g., Hadoop MapReduce jobs) co-locate in today’s data centers. Developing realistic benchmarks that reflect such practical mixed-workload scenarios is key to producing trustworthy results when evaluating and comparing data center systems. This requires using actual workloads as well as guaranteeing that their submissions follow patterns hidden in real-world traces. However, existing benchmarks either generate actual workloads based on probability models, or replay real-world workload traces using basic I/O operations. To fill this gap, we propose a benchmark tool that is a first step towards generating a mix of actual service and data analysis workloads on the basis of real workload traces. Our tool includes a combiner that enables the replaying of actual workloads according to the workload traces, and a multi-tenant generator that flexibly scales the workloads up and down according to users’ requirements. Based on this, our demo illustrates the workload customization and generation process using a visual interface. The proposed tool, called BigDataBench-MT, is a multi-tenant version of our comprehensive benchmark suite BigDataBench and it is publicly available from http://prof.ict.ac.cn/BigDataBench/multi-tenancyversion/.

IEEE Transactions on Computers | 2015

PowerTracer: Tracing Requests in Multi-Tier Services to Reduce Energy Inefficiency

Gang Lu; Jianfeng Zhan; Haining Wang; Lin Yuan; Yunwei Gao; Chuliang Weng; Yong Qi

As energy has become one of the key operating costs in running a data center and power waste commonly exists, it is essential to reduce energy inefficiency inside data centers. In this paper, we develop an innovative framework, called PowerTracer, for diagnosing energy inefficiency and saving power. Inside the framework, we first present a resource tracing method based on request tracing in multi-tier services of black boxes. Then, we propose a generalized methodology of applying a request tracing approach for energy inefficiency diagnosis and power saving in multi-tier service systems. With insights into service performance and resource consumption of individual requests, we develop (1) a bottleneck diagnosis tool that pinpoints the root causes of energy inefficiency, and (2) a power saving method that enables dynamic voltage and frequency scaling (DVFS) with online request tracing. We implement a prototype of PowerTracer, and conduct extensive experiments to validate its effectiveness. Our tool analyzes several state-of-the-practice and state-of-the-art DVFS control policies and uncovers existing energy inefficiencies. Meanwhile, the experimental results demonstrate that PowerTracer outperforms its peers in power saving.

WRI World Congress on Software Engineering | 2010

Comparison of Requirement Items Based on the Requirements Change Management System of QONE

Gang Lu; Feng Yuan

Requirements changes are difficult to avoid in software development processes, and how to manage requirements change is still an open issue. This paper proposes an innovative algorithm, named LCS-NP++, for comparing different versions of requirement specifications to identify requirement changes. We have integrated this algorithm into a production software process management platform, Qone, which is developed by the Institute of Software, Chinese Academy of Sciences. Our practice has shown that it is effective and efficient in comparing requirement items.
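The abstract does not describe how LCS-NP++ works, so the sketch below is not that algorithm. It shows only the textbook longest-common-subsequence (LCS) comparison that the name suggests as a starting point, applied to two versions of a requirement item list. The function name and the sample requirement items are hypothetical.

```python
from typing import List, Sequence, Tuple


def diff_items(old: Sequence[str], new: Sequence[str]) -> Tuple[List[str], List[str], List[str]]:
    """Compare two item lists with a plain LCS; return (kept, removed, added)."""
    m, n = len(old), len(new)
    # dp[i][j] holds the LCS length of old[i:] and new[j:].
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if old[i] == new[j]:
                dp[i][j] = dp[i + 1][j + 1] + 1
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])

    kept: List[str] = []
    removed: List[str] = []
    added: List[str] = []
    i = j = 0
    # Walk the table from the start, classifying each item.
    while i < m and j < n:
        if old[i] == new[j]:
            kept.append(old[i])
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            removed.append(old[i])
            i += 1
        else:
            added.append(new[j])
            j += 1
    removed.extend(old[i:])
    added.extend(new[j:])
    return kept, removed, added


if __name__ == "__main__":
    # Hypothetical requirement items in two versions of a specification.
    v1 = ["login", "search", "export"]
    v2 = ["login", "export", "audit"]
    print(diff_items(v1, v2))
```

Items in the common subsequence are treated as unchanged, while the remaining items of the old and new versions are reported as removed and added respectively.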

The Journal of Finance and Data Science | 2015