Hucheng Zhou | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Hucheng Zhou is active.

Explore More

Publication

Featured researches published by Hucheng Zhou.

IEEE Transactions on Parallel and Distributed Systems | 2015

Spotting Code Optimizations in Data-Parallel Pipelines through PeriSCOPE

Xuepeng Fan; Zhenyu Guo; Hai Jin; Xiaofei Liao; Jiaxing Zhang; Hucheng Zhou; Sean McDirmid; Wei Lin; Jingren Zhou; Lidong Zhou

To minimize the amount of data-shuffling I/O that occurs between the pipeline stages of a distributed data-parallel program, its procedural code must be optimized with full awareness of the pipeline that it executes in. Unfortunately, neither pipeline optimizers nor traditional compilers examine both the pipeline and procedural code of a data-parallel program so programmers must either hand-optimize their program across pipeline stages or live with poor performance. To resolve this tension between performance and programmability, this paper describes PeriSCOPE, which automatically optimizes a data-parallel programs procedural code in the context of data flow that is reconstructed from the programs pipeline topology. Such optimizations eliminate unnecessary code and data, perform early data filtering, and calculate small derived values (e.g., predicates) earlier in the pipeline, so that less data - sometimes much less data - is transferred between pipeline stages. PeriSCOPE further leverages symbolic execution to enlarge the scope of such optimizations by eliminating dead code. We describe how PeriSCOPE is implemented and evaluate its effectiveness on real production jobs.

international conference on software engineering | 2014

Nondeterminism in MapReduce considered harmful? an empirical study on non-commutative aggregators in MapReduce programs

Tian Xiao; Jiaxing Zhang; Hucheng Zhou; Zhenyu Guo; Sean McDirmid; Wei Lin; Wenguang Chen; Lidong Zhou

The simplicity of MapReduce introduces unique subtleties that cause hard-to-detect bugs; in particular, the unfixed order of reduce function input is a source of nondeterminism that is harmful if the reduce function is not commutative and sensitive to input order. Our extensive study of production MapReduce programs reveals interesting findings on commutativity, nondeterminism, and correctness. Although non-commutative reduce functions lead to five bugs in our sample of well-tested production programs, we surprisingly have found that many non-commutative reduce functions are mostly harmless due to, for example, implicit data properties. These findings are instrumental in advancing our understanding of MapReduce program correctness.

international conference on software engineering | 2015

An empirical study on quality issues of production big data platform

Hucheng Zhou; Jian-Guang Lou; Hongyu Zhang; Haibo Lin; Haoxiang Lin; Tingting Qin

Big Data computing platform has evolved to be a multi-tenant service. The service quality matters because system failure or performance slowdown could adversely affect business and user experience. There is few study in literature on service quality issues of production Big Data computing platform. In this paper, we present an empirical study on the service quality issues of Microsoft ProductA, which is a company-wide multi-tenant Big Data computing platform, serving thousands of customers from hundreds of teams. ProductA has a well-defined incident management process, which helps customers report and mitigate service quality issues on 24/7 basis. This paper explores the common symptom, causes and mitigation of service quality issues in Big Data computing. We conduct an empirical study on 210 real service quality issues in ProductA. Our major findings include (1) 21.0% of escalations are caused by hardware faults; (2) 36.2% are caused by system side defects; (3) 37.2% are due to customer side faults. We also studied the general diagnosis process and the commonly adopted mitigation solutions. Our findings can help improve current development and maintenance practice of Big Data computing platform, and motivate tool support.

programming language design and implementation | 2011

An SSA-based algorithm for optimal speculative code motion under an execution profile

Hucheng Zhou; Wenguang Chen; Fred C. Chow

To derive maximum optimization benefits from partial redundancy elimination (PRE),it is necessary to go beyond its safety constraint. Algorithms for optimal speculative code motion have been developed based on the application of minimum cut to flow networks formed out of the control flow graph. These previous techniques did not take advantage of the SSA form, which is a popular program representation widely used in modern-day compilers. We have developed the MC-SSAPRE algorithm that enables an SSA-based compiler to take full advantage of SSA to perform optimal speculative code motion efficiently when an execution profile is available. Our work shows that it is possible to form flow networks out of SSA graphs, and the min-cut technique can be applied equally well on these flow networks to find the optimal code placement. We provide proofs of the correctness and computational and lifetime optimality of MC-SSAPRE. We analyze its time complexity to show its efficiency advantage. We have implemented MC-SSAPRE in the open-sourced Path64 compiler. Our experimental data based on the full SPEC CPU2006 Benchmark Suite show that MC-SSAPRE can further improve program performance over traditional SSAPRE, and that our sparse approach to the problem does result in smaller problem sizes.

symposium on cloud computing | 2014

Automating Distributed Partial Aggregation

Chang Liu; Jiaxing Zhang; Hucheng Zhou; Sean McDirmid; Zhenyu Guo; Thomas Moscibroda

Partial aggregation is of great importance in many distributed data-parallel systems. Most notably, it is commonly applied by MapReduce programs to optimize I/O by successively aggregating partially reduced results into a final result, as opposed to aggregating all input records at once. In spite of its importance, programmers currently enable partial aggregation by tediously encoding their reduce functionality into separate reduce and combine functions. This is error prone and often leads to missed optimization opportunities. This paper proposes an algorithm that automatically verifies if the original monolithic reduce function of a MapReduce program is eligible for partial aggregation, and if so, synthesizes enabling partial aggregation code. The key insight behind this algorithm is a novel necessary and sufficient condition for when partial aggregation is applicable to a reduce function. This insight provides us with a formal foundation for an automaton, which derives a satisfiability problem that can be fed into a standard SMT solver. By doing so, we transform the problem of synthesis into a program inversion problem, which is however nondeterministic. Although such inversion is hard to solve in general, we observe that most reducers in practical distributed computing contexts can be classified into a few categories for which we can design efficient synthesis algorithms. Finally, we build and evaluate a prototype of our method to demonstrate its feasibility in the SCOPE distributed data-parallel system.

conference on object-oriented programming systems, languages, and applications | 2014

Cybertron: pushing the limit on I/O reduction in data-parallel programs

Tian Xiao; Zhenyu Guo; Hucheng Zhou; Jiaxing Zhang; Xu Zhao; Chencheng Ye; Xi Wang; Wei Lin; Wenguang Chen; Lidong Zhou

I/O reduction has been a major focus in optimizing data-parallel programs for big-data processing. While the current state-of-the-art techniques use static program analysis to reduce I/O, Cybertron proposes a new direction that incorporates runtime mechanisms to push the limit further on I/O reduction. In particular, Cybertron tracks how data is used in the computation accurately at runtime to filter unused data at finer granularity dynamically, beyond what current static-analysis based mechanisms are capable of, and to facilitate a new mechanism called constraint based encoding for more efficient encoding. Cybertron has been implemented and applied to production data-parallel programs; our extensive evaluations on real programs and real data have shown its effectiveness on I/O reduction over the existing mechanisms at reasonable CPU cost, and its improvement on end-to-end performance in various network environments.

networked systems design and implementation | 2012