Huizhan Yi

National University of Defense Technology

Publication

Featured research published by Huizhan Yi.

Journal of Computer Science and Technology | 2011

Optimizing linpack benchmark on GPU-accelerated petascale supercomputer

Feng Wang; Canqun Yang; Yunfei Du; Juan Chen; Huizhan Yi; Weixia Xu

In this paper we present the programming of the Linpack benchmark on the TianHe-1 system, the first petascale supercomputer system of China and the largest GPU-accelerated heterogeneous system attempted to date. A hybrid programming model consisting of MPI, OpenMP and streaming computing is described to exploit the task, thread and data parallelism of Linpack. We explain how we optimized the load distribution across the CPUs and GPUs using the two-level adaptive method and describe the implementation in detail. To overcome the low bandwidth of CPU-GPU communication, we present a software pipelining technique to hide the communication overhead. Combined with other traditional optimizations, the Linpack we developed achieved 196.7 GFLOPS on a single compute element of TianHe-1. This result is 70.1% of the peak compute capability and 3.3 times faster than the result obtained using the vendor's library. On the full configuration of TianHe-1 our optimizations resulted in a Linpack performance of 0.563 PFLOPS, which made TianHe-1 the 5th fastest supercomputer on the Top500 list in November 2009.
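
The abstract does not spell out the two-level adaptive method, so the following is only a minimal single-level sketch of the general idea of adaptive CPU/GPU load splitting, not the authors' implementation. It assumes each device's run time scales linearly with the share of work it receives; the function names and the device rates in `simulate` are hypothetical.

```python
def update_gpu_fraction(gpu_fraction, gpu_time, cpu_time):
    """Return a new GPU work share so both devices would finish together.

    Assumption: time on each device is proportional to the work assigned,
    so the measured rates from the last step predict the next step.
    """
    gpu_rate = gpu_fraction / gpu_time          # work per second on the GPU
    cpu_rate = (1.0 - gpu_fraction) / cpu_time  # work per second on the CPU
    return gpu_rate / (gpu_rate + cpu_rate)


def simulate(true_gpu_rate, true_cpu_rate, steps=5, gpu_fraction=0.5):
    """Iterate the update against fixed, hypothetical device speeds."""
    history = [gpu_fraction]
    for _ in range(steps):
        # "Measure" how long each device took on its current share.
        gpu_time = gpu_fraction / true_gpu_rate
        cpu_time = (1.0 - gpu_fraction) / true_cpu_rate
        gpu_fraction = update_gpu_fraction(gpu_fraction, gpu_time, cpu_time)
        history.append(gpu_fraction)
    return history


if __name__ == "__main__":
    # A GPU three times faster than the CPU should end up with 3/4 of the work.
    print(simulate(true_gpu_rate=3.0, true_cpu_rate=1.0))
```

Under the linear-time assumption the split settles at the ratio of the device speeds; on real hardware the measured rates drift with problem size, which is why the split is re-estimated every step.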

Journal of Computer Science and Technology | 2014

OpenMC: Towards Simplifying Programming for TianHe Supercomputers

XiangKe Liao; Can-Qun Yang; Tao Tang; Huizhan Yi; Feng Wang; Qiang Wu; Jingling Xue

Modern petascale and future exascale systems are massively heterogeneous architectures. Developing productive intra-node programming models is crucial toward addressing their programming challenge. We introduce a directive-based intra-node programming model, OpenMC, and show that this new model can achieve ease of programming, high performance, and the degree of portability desired for heterogeneous nodes, especially those in TianHe supercomputers. While existing models are geared towards offloading computations to accelerators (typically one), OpenMC aims to more uniformly and adequately exploit the potential offered by multiple CPUs and accelerators in a compute node. OpenMC achieves this by providing a unified abstraction of hardware resources as workers and facilitating the exploitation of asynchronous task parallelism on the workers. We present an overview of OpenMC, a prototyping implementation, and results from some initial comparisons with OpenMP and hand-written code in developing six applications on two types of nodes from TianHe supercomputers.

Journal of Computer Science and Technology | 2006

Toward the optimal configuration of dynamic voltage scaling points in real-time applications

Huizhan Yi; Xuejun Yang

In real-time applications, compiler-directed dynamic voltage scaling (DVS) can reduce energy consumption efficiently: the compiler places voltage scaling points at suitable locations in the program, and at each point the supply voltage and clock frequency are adjusted according to the relationship between the reduced time and the reduced workload. This paper presents the optimal configuration of dynamic voltage scaling points in the absence of voltage scaling overhead, that is, the configuration that minimizes energy consumption. The conclusion is proved theoretically and then confirmed by simulations with an equally-spaced voltage scaling configuration.
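
To make the role of voltage scaling points concrete, here is a small sketch under stated assumptions; it is not the paper's model or proof. It assumes equally spaced scaling points, zero scaling overhead, energy per cycle proportional to the square of the clock frequency, and a frequency chosen at each point as remaining worst-case cycles divided by remaining time. The function name `dvs_energy` and all numbers are hypothetical.

```python
def dvs_energy(actual_cycles, worst_case_per_segment, deadline):
    """Simulate a task split into segments by voltage scaling points.

    actual_cycles: cycles actually executed in each segment (each value
        must not exceed worst_case_per_segment).
    Returns (energy, finish_time) in arbitrary units.
    """
    elapsed = 0.0
    energy = 0.0
    n = len(actual_cycles)
    for i, cycles in enumerate(actual_cycles):
        # Slowest frequency that still meets the deadline if every
        # remaining segment runs for its worst-case cycle count.
        remaining_worst = (n - i) * worst_case_per_segment
        freq = remaining_worst / (deadline - elapsed)
        energy += cycles * freq ** 2   # assumed: energy per cycle ~ f^2
        elapsed += cycles / freq
    return energy, elapsed


if __name__ == "__main__":
    # Same total work (200 of 400 worst-case cycles), deadline 400.
    no_points = dvs_energy([200], 400, 400.0)            # one frequency for the whole task
    four_segments = dvs_energy([50, 50, 50, 50], 100, 400.0)
    print(no_points, four_segments)
```

In this toy model the run with scaling points uses less energy than the run without them, because each point lets the frequency drop once earlier segments finish ahead of their worst case, and the deadline is still met.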

Archive | 2012

Method for partitioning dynamic tasks of CPU and GPU based on load balance

Juan Chen; Yunfei Du; Chun Huang; Xiangke Liao; Feng Wang; Canqun Yang; Huizhan Yi; Kejia Zhao

Archive | 2010

Quickening method utilizing cooperative work of CPU and GPU to solve triangular linear equation set

Juan Chen; Yunfei Du; Chun Huang; Xiangke Liao; Jie Liu; Feng Wang; Canqun Yang; Huizhan Yi

Archive | 2010

Full-covered automatic generating method of test case package of microprocessor

Feng Wang; Canqun Yang; Huizhan Yi; Juan Chen; Chun Huang; Kejia Zhao; Yunfei Du

Archive | 2012

Method for automatically testing energy consumption of computer application program interval

Huizhan Yi; Kejia Zhao; Canqun Yang; Chun Huang; Juan Chen; Feng Wang; Yunfei Du; Chunjiang Li

Archive | 2012

Shared memory based method for realizing multiprocess GPU (Graphics Processing Unit) sharing

Yunfei Du; Canqun Yang; Huizhan Yi; Feng Wang; Chun Huang; Kejia Zhao; Juan Chen; Chunjiang Li; Ke Zuo; Lin Peng

Archive | 2011

Multi-core system fault tolerance method based on memory caching technology

Huizhan Yi; Chunjiang Li; Chun Huang; Canqun Yang; Kejia Zhao; Yunfei Du; Lin Peng; Juan Chen; Feng Wang; Ke Zuo

Archive | 2011

Method for automatically generating double-precision SIMD component chip-level verification test stimulus

Chunjiang Li; Huizhan Yi; Kejia Zhao; Canqun Yang; Chun Huang; Feng Wang; Yunfei Du; Juan Chen; Lin Peng
