Publication


Featured research published by Chunjie Luo.


IEEE International Symposium on High-Performance Computer Architecture (HPCA) | 2014

BigDataBench: A big data benchmark suite from internet services

Lei Wang; Jianfeng Zhan; Chunjie Luo; Yuqing Zhu; Qiang Yang; Yongqiang He; Wanling Gao; Zhen Jia; Yingjie Shi; Shujie Zhang; Chen Zheng; Gang Lu; Kent Zhan; Xiaona Li; Bizhu Qiu

As the architecture, systems, and data management communities pay greater attention to innovative big data systems and architectures, the pressure to benchmark and evaluate these systems rises. However, the complexity, diversity, frequently changing workloads, and rapid evolution of big data systems pose great challenges for big data benchmarking. Considering the broad use of big data systems, for the sake of fairness, big data benchmarks must include diverse data and workloads, which is the prerequisite for evaluating big data systems and architectures. Most state-of-the-art big data benchmarking efforts target specific types of applications or system software stacks, and hence are not suited to the purposes mentioned above. This paper presents our joint research efforts on this issue with several industrial partners. Our big data benchmark suite, BigDataBench, not only covers broad application scenarios but also includes diverse and representative data sets. Currently, we choose 19 big data benchmarks along the dimensions of application scenarios, operations/algorithms, data types, data sources, software stacks, and application types; together they are comprehensive enough to fairly measure and evaluate big data systems and architectures. BigDataBench is publicly available from the project home page, http://prof.ict.ac.cn/BigDataBench. We also comprehensively characterize the 19 big data workloads in BigDataBench with varying data inputs. On a typical state-of-practice processor, the Intel Xeon E5645, we make the following observations. First, in comparison with traditional benchmarks, including PARSEC, HPCC, and SPEC CPU, big data applications have very low operation intensity, which measures the ratio of the total number of instructions to the total number of bytes of memory accessed. Second, the volume of data input has a non-negligible impact on micro-architectural characteristics, which may pose challenges for simulation-based big data architecture research. Last but not least, corroborating the observations in CloudSuite and DCBench (which use smaller data inputs), we find that the numbers of L1 instruction cache (L1I) misses per 1000 instructions (MPKI) of the big data applications are higher than in the traditional benchmarks; we also find that L3 caches are effective for the big data applications, corroborating the observation in DCBench.
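
For concreteness, here is a minimal sketch (ours, not from the paper) of how the two derived metrics named in this abstract, operation intensity and MPKI, are computed from raw counter totals; the counter values below are hypothetical:

```python
def operation_intensity(total_instructions: int, total_memory_bytes: int) -> float:
    """Operation intensity as defined above: total instructions executed
    divided by total bytes transferred to/from memory."""
    return total_instructions / total_memory_bytes

def mpki(miss_count: int, total_instructions: int) -> float:
    """Misses per kilo-instruction (MPKI): cache misses per 1000 instructions."""
    return miss_count * 1000 / total_instructions

# Hypothetical counter readings for one workload run.
instructions = 8_500_000_000
memory_bytes = 62_000_000_000   # bytes moved between the cache hierarchy and DRAM
l1i_misses   = 190_000_000

print(f"operation intensity: {operation_intensity(instructions, memory_bytes):.3f} instr/byte")
print(f"L1I MPKI: {mpki(l1i_misses, instructions):.1f}")
```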


IEEE International Symposium on Workload Characterization (IISWC) | 2013

Characterizing data analysis workloads in data centers

Zhen Jia; Lei Wang; Jianfeng Zhan; Lixin Zhang; Chunjie Luo

As the amount of data explodes, more and more corporations are using data centers to make effective decisions and gain a competitive edge. Data analysis applications play a significant role in data centers, so it has become increasingly important to understand their behaviors in order to further improve the performance of data center computer systems. In this paper, after investigating the three most important application domains in terms of page views and daily visitors, we choose eleven representative data analysis workloads and characterize their micro-architectural characteristics using hardware performance counters, in order to understand the impacts and implications of data analysis workloads on systems equipped with modern superscalar out-of-order processors. Our study reveals that data analysis applications share many inherent characteristics, which place them in a different class from desktop (SPEC CPU2006), HPC (HPCC), and service workloads, including traditional server workloads (SPECweb2005) and scale-out service workloads (four of the six benchmarks in CloudSuite); accordingly, we give several recommendations for architecture and system optimizations. On the basis of our workload characterization, we released a benchmark suite named DCBench for typical datacenter workloads, including data analysis and service workloads, under an open-source license on our project home page, http://prof.ict.ac.cn/DCBench. We hope that DCBench is helpful for architecture and small-to-medium-scale system research for datacenter computing.
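
As a rough illustration of this counter-based methodology, the sketch below (our assumption, not the paper's tooling) collects cycle and instruction counts with Linux `perf stat` and derives CPI. It assumes a Linux host with perf installed, and the CSV parsing is deliberately simplified:

```python
import subprocess

def perf_counters(cmd: list[str]) -> dict[str, int]:
    """Run a command under `perf stat` and return raw counter totals.
    Event availability varies by CPU; these are standard perf events."""
    events = "cycles,instructions,cache-misses"
    result = subprocess.run(
        ["perf", "stat", "-x", ",", "-e", events, "--"] + cmd,
        capture_output=True, text=True,
    )
    counters = {}
    # With -x, perf writes CSV lines to stderr: value,unit,event,...
    for line in result.stderr.splitlines():
        fields = line.split(",")
        if len(fields) >= 3 and fields[0].isdigit():
            counters[fields[2]] = int(fields[0])
    return counters

c = perf_counters(["sleep", "1"])
if "cycles" in c and "instructions" in c:
    print(f"CPI: {c['cycles'] / c['instructions']:.2f}")
```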


Frontiers of Computer Science in China | 2012

CloudRank-D: benchmarking and ranking cloud computing systems for data processing applications

Chunjie Luo; Jianfeng Zhan; Zhen Jia; Lei Wang; Gang Lu; Lixin Zhang; Cheng-Zhong Xu; Ninghui Sun

With the explosive growth of information, more and more organizations are deploying private cloud systems or renting public cloud systems to process big data. However, there is no existing benchmark suite for evaluating cloud performance at the whole-system level. To the best of our knowledge, this paper proposes the first benchmark suite, CloudRank-D, to benchmark and rank cloud computing systems that are shared for running big data applications. We analyze the limitations of previous metrics, e.g., floating point operations, for evaluating a cloud computing system, and propose two simple, complementary metrics: data processed per second and data processed per Joule. We detail the design of CloudRank-D, which considers representative applications, diversity of data characteristics, and dynamic behaviors of both applications and system software platforms. Through experiments, we demonstrate the advantages of our proposed metrics. In several case studies, we evaluate two small-scale deployments of cloud computing systems using CloudRank-D.
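
The two proposed metrics are simple ratios; a minimal sketch, with hypothetical run figures:

```python
def data_processed_per_second(bytes_processed: int, elapsed_seconds: float) -> float:
    """DPS: total data processed divided by wall-clock time."""
    return bytes_processed / elapsed_seconds

def data_processed_per_joule(bytes_processed: int, energy_joules: float) -> float:
    """DPJ: total data processed divided by energy consumed."""
    return bytes_processed / energy_joules

# Hypothetical run: a cluster processes 2 TB in 40 minutes drawing 3 kW on average.
bytes_processed = 2 * 10**12
elapsed = 40 * 60               # seconds
energy = 3_000 * elapsed        # joules = average watts * seconds

print(f"DPS: {data_processed_per_second(bytes_processed, elapsed) / 1e6:.1f} MB/s")
print(f"DPJ: {data_processed_per_joule(bytes_processed, energy) / 1e3:.1f} KB/J")
```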


arXiv: Databases | 2013

BDGS: A Scalable Big Data Generator Suite in Big Data Benchmarking

Zijian Ming; Chunjie Luo; Wanling Gao; Rui Han; Qiang Yang; Lei Wang; Jianfeng Zhan

Data generation is a key issue in big data benchmarking: the goal is to generate application-specific data sets that meet the 4V requirements of big data. Specifically, big data generators need to generate scalable data (Volume) of different types (Variety) at controllable generation rates (Velocity) while keeping the important characteristics of the raw data (Veracity). This raises new challenges in designing generators that are both efficient and effective. To date, most existing techniques can only generate limited types of data and support specific big data systems such as Hadoop. Hence, we develop a tool, the Big Data Generator Suite (BDGS), to efficiently generate scalable big data while employing data models derived from real data to preserve data veracity. The effectiveness of BDGS is demonstrated by developing six data generators covering three representative data types (structured, semi-structured, and unstructured) and three data sources (text, graph, and table data).
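
To illustrate the fit-a-model-then-scale idea (BDGS's actual data models, e.g., for text, are more sophisticated than this), here is a toy sketch that derives a unigram model from a small seed corpus and samples a synthetic corpus of arbitrary volume:

```python
import random
from collections import Counter

def fit_model(seed_corpus: list[str]) -> tuple[list[str], list[int]]:
    """Derive a simple unigram model (word frequencies) from real seed text."""
    counts = Counter(word for doc in seed_corpus for word in doc.split())
    words, weights = zip(*counts.items())
    return list(words), list(weights)

def generate(words: list[str], weights: list[int],
             n_docs: int, doc_len: int) -> list[str]:
    """Sample a synthetic corpus of arbitrary volume from the fitted model."""
    return [" ".join(random.choices(words, weights=weights, k=doc_len))
            for _ in range(n_docs)]

seed = ["big data benchmark suite", "data generation preserves veracity"]
words, weights = fit_model(seed)
synthetic = generate(words, weights, n_docs=1000, doc_len=50)  # scale volume freely
```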


IEEE International Symposium on Workload Characterization (IISWC) | 2014

Characterizing and subsetting big data workloads

Zhen Jia; Jianfeng Zhan; Lei Wang; Rui Han; Sally A. McKee; Qiang Yang; Chunjie Luo; Jingwei Li

Big data benchmark suites must include a diversity of data and workloads to be useful in fairly evaluating big data systems and architectures. However, using truly comprehensive benchmarks poses great challenges for the architecture community. First, we need to thoroughly understand the behaviors of a variety of workloads. Second, our usual simulation-based research methods become prohibitively expensive for big data. As big data is an emerging field, more and more software stacks are being proposed to facilitate the development of big data applications, which aggravates these challenges. In this paper, we first use Principal Component Analysis (PCA) to identify the most important of 45 metrics for characterizing the big data workloads in BigDataBench, a comprehensive big data benchmark suite. Second, we apply a clustering technique to the principal components obtained from the PCA to investigate the similarity among big data workloads, and we verify the importance of including different software stacks in big data benchmarking. Third, we select seven representative big data workloads by removing redundant ones and release the BigDataBench simulation version, which is publicly available from http://prof.ict.ac.cn/BigDataBench/simulatorversion/.
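
A minimal sketch of the PCA-then-cluster subsetting methodology, using scikit-learn as an assumed stand-in for the paper's tooling and random data in place of the real workload-by-metric matrix:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

# Hypothetical matrix: one row per workload, one column per metric (45 in the paper).
rng = np.random.default_rng(0)
X = rng.normal(size=(32, 45))

# 1. Standardize metrics, then keep the principal components
#    that explain most of the variance.
X_std = StandardScaler().fit_transform(X)
pcs = PCA(n_components=0.9).fit_transform(X_std)   # 90% of variance

# 2. Cluster workloads in PC space; pick one representative per cluster
#    (here, the member closest to the cluster centroid).
k = 7   # the paper selects seven representatives
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(pcs)
representatives = []
for c in range(k):
    members = np.where(km.labels_ == c)[0]
    dists = np.linalg.norm(pcs[members] - km.cluster_centers_[c], axis=1)
    representatives.append(int(members[np.argmin(dists)]))
print("representative workload indices:", representatives)
```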


IEEE International Parallel and Distributed Processing Symposium (IPDPS) | 2012

High Volume Throughput Computing: Identifying and Characterizing Throughput Oriented Workloads in Data Centers

Jianfeng Zhan; Lixin Zhang; Ninghui Sun; Lei Wang; Zhen Jia; Chunjie Luo

For the first time, this paper systematically identifies three categories of throughput-oriented workloads in data centers: services, data processing applications, and interactive real-time applications, whose goals are, respectively, to increase the volume of throughput in terms of requests processed, data processed, or the maximum number of simultaneous subscribers supported. We coin a new term, high volume throughput computing (HVC), to describe these workloads and the data center systems designed for them. We characterize and compare HVC with other computing paradigms, e.g., high throughput computing, warehouse-scale computing, and cloud computing, in terms of levels, workloads, metrics, coupling degree, data scales, and number of jobs or service instances. We also preliminarily report our ongoing work on metrics and benchmarks for HVC systems, which is the foundation of designing innovative data center systems for HVC workloads.


International Conference on Artificial Neural Networks (ICANN) | 2018

Cosine Normalization: Using Cosine Similarity Instead of Dot Product in Neural Networks

Chunjie Luo; Jianfeng Zhan; Xiaohe Xue; Lei Wang; Rui Ren; Qiang Yang

Traditionally, multi-layer neural networks use the dot product between the output vector of the previous layer and the incoming weight vector as the input to the activation function. The result of the dot product is unbounded and thus increases the risk of large variance. Large variance of a neuron's output makes the model sensitive to changes in the input distribution, resulting in poor generalization, and aggravates the internal covariate shift that slows down training. To bound the dot product and decrease the variance, we propose to use cosine similarity or centered cosine similarity (the Pearson correlation coefficient) instead of the dot product in neural networks, which we call cosine normalization. We compare cosine normalization with batch, weight, and layer normalization in fully-connected and convolutional neural networks on the MNIST, 20 Newsgroups, CIFAR-10/100, and SVHN data sets. Experiments show that cosine normalization achieves better performance than the other normalization techniques.
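
The core substitution is easy to state in code. A minimal NumPy sketch (our illustration, not the authors' implementation) of a fully-connected layer's pre-activation with and without cosine normalization:

```python
import numpy as np

def dense_dot(x: np.ndarray, W: np.ndarray) -> np.ndarray:
    """Conventional pre-activation: unbounded dot product."""
    return x @ W.T

def dense_cosnorm(x: np.ndarray, W: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Cosine normalization: the pre-activation is the cosine of the angle
    between the input and each weight vector, so it lies in [-1, 1]."""
    x_norm = np.linalg.norm(x, axis=-1, keepdims=True)   # per example
    w_norm = np.linalg.norm(W, axis=-1, keepdims=True)   # per neuron
    return (x @ W.T) / (x_norm * w_norm.T + eps)

x = np.random.randn(4, 128)    # batch of 4 inputs
W = np.random.randn(64, 128)   # 64 neurons
out = dense_cosnorm(x, W)
assert np.all(np.abs(out) <= 1 + 1e-6)   # bounded, unlike dense_dot
```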


IEEE Transactions on Parallel and Distributed Systems | 2017

Understanding Big Data Analytics Workloads on Modern Processors

Zhen Jia; Jianfeng Zhan; Lei Wang; Chunjie Luo; Wanling Gao; Yi Jin; Rui Han; Lixin Zhang

Big data analytics workloads are highly significant in modern data centers, and it is increasingly important to characterize representative workloads and understand their behaviors so as to improve the performance of data center computer systems. In this paper, we embark on a comprehensive study to understand the impacts and performance implications of big data analytics workloads on systems equipped with modern superscalar out-of-order processors. After investigating the three most important application domains in Internet services in terms of page views and daily visitors, we choose 11 representative data analytics workloads and characterize their micro-architectural behaviors using hardware performance counters. Our study reveals that the big data analytics workloads share many inherent characteristics, which place them in a different class from traditional workloads and scale-out services. To further understand the characteristics of big data analytics workloads, we perform correlation analysis to identify the key factors that affect cycles per instruction (CPI). We also reveal that the increasing complexity of big data software stacks puts higher pressure on modern processor pipelines.
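
A toy sketch of the kind of correlation analysis described, ranking candidate metrics by the magnitude of their Pearson correlation with CPI; the data and metric names here are hypothetical, not the paper's:

```python
import numpy as np

# Hypothetical per-workload measurements: CPI plus candidate metrics.
rng = np.random.default_rng(1)
metrics = {
    "L1I_MPKI": rng.uniform(5, 40, 11),
    "L2_MPKI": rng.uniform(1, 20, 11),
    "branch_mispred_rate": rng.uniform(0.01, 0.1, 11),
}
cpi = rng.uniform(0.8, 2.5, 11)

# Rank metrics by |Pearson correlation| with CPI.
ranked = sorted(
    ((name, float(np.corrcoef(vals, cpi)[0, 1])) for name, vals in metrics.items()),
    key=lambda kv: abs(kv[1]), reverse=True,
)
for name, r in ranked:
    print(f"{name}: r = {r:+.2f}")
```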


international conference on parallel architectures and compilation techniques | 2018

Data motifs: a lens towards fully understanding big data and AI workloads

Wanling Gao; Jianfeng Zhan; Lei Wang; Chunjie Luo; Daoyi Zheng; Fei Tang; Biwei Xie; Chen Zheng; Xu Wen; Xiwen He; Hainan Ye; Rui Ren

The complexity and diversity of big data and AI workloads make understanding them difficult and challenging. This paper proposes a new approach to modeling and characterizing big data and AI workloads. We consider each big data and AI workload as a pipeline of one or more classes of units of computation performed on different initial or intermediate data inputs. Each class of unit of computation captures the common requirements while being reasonably divorced from individual implementations, and hence we call it a data motif. For the first time, among a wide variety of big data and AI workloads, we identify eight data motifs that take up most of the run time of those workloads: Matrix, Sampling, Logic, Transform, Set, Graph, Sort, and Statistic. We implement the eight data motifs on different software stacks as the micro-benchmarks of an open-source big data and AI benchmark suite, BigDataBench 4.0 (publicly available from http://prof.ict.ac.cn/BigDataBench), and perform a comprehensive characterization of those data motifs from the perspectives of data sizes, types, sources, and patterns, as a lens towards fully understanding big data and AI workloads. We believe the eight data motifs are promising abstractions and tools not only for big data and AI benchmarking, but also for domain-specific hardware and software co-design.


arXiv: Information Retrieval | 2013

BigDataBench: A Big Data Benchmark Suite from Web Search Engines

Lei Wang; Jianfeng Zhan; Chunjie Luo; Bizhu Qiu; Yongqiang He; Yong Qi; Shimin Gong; Zhiguo Li; Xiaona Li; Wanling Gao; Yuqing Zhu; Zhen Jia; Shujie Zhang

Collaboration


Dive into Chunjie Luo's collaboration.

Top Co-Authors:

Jianfeng Zhan (Chinese Academy of Sciences)
Lei Wang (Chinese Academy of Sciences)
Wanling Gao (Chinese Academy of Sciences)
Zhen Jia (Chinese Academy of Sciences)
Rui Ren (Chinese Academy of Sciences)
Xiwen He (Chinese Academy of Sciences)
Chen Zheng (Chinese Academy of Sciences)
Daoyi Zheng (Chinese Academy of Sciences)
Gang Lu (Chinese Academy of Sciences)
Qiang Yang (Chinese Academy of Sciences)