Publication


Featured research published by Yuxing Tang.


Embedded and Ubiquitous Computing | 2010

Phase Characterization and Classification for Micro-architecture Soft Error

Yu Cheng; Anguo Ma; Yuxing Tang; Minxuan Zhang

Transient faults have become a key challenge in modern processor design. Processor designers use the Architectural Vulnerability Factor (AVF) to estimate the soft error rate of micro-architectures. Dynamic, phase-based system reliability management, which tunes system hardware and software parameters at runtime for different phases, has become a focus in the field of processor design. The phase characterization technique (PCT) and the phase classification algorithm (PCA) determine the accuracy of phase identification, which is the foundation of dynamic, phase-based system management. To our knowledge, this paper is the first to give a comprehensive evaluation and comparison of PCTs and PCAs for micro-architecture soft error. We first compare the efficiency of PCTs based on basic block vectors (BBV) and on performance metric counters (PMC) in reliability-oriented phase characterization on three micro-architectural structures (i.e., instruction queue, function unit and reorder buffer). Experimental results show that the PMC-based PCT performs better than the BBV-based PCT for most programs studied. We also compare the accuracy of three clustering algorithms (i.e., hierarchical clustering, k-means clustering and regression tree) in reliability-oriented phase classification. The regression tree method improves classification accuracy by 30% on average compared with the other two PCAs. Based on these comparisons, we propose the optimal combination of PCT and PCA for soft error reliability-oriented phase identification: PMC with regression tree. In addition, we quantify the upper bound of the predictability of AVF using BBV and PMC. Overall, PMC explains 82% of AVF on average, while BBV explains 78%.
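As a rough illustration of what PMC-based phase classification involves, the sketch below groups execution intervals into phases with plain k-means, one of the three clustering algorithms the abstract compares. It is a minimal sketch under stated assumptions, not the authors' implementation: the `kmeans` function, the `intervals` data, and the choice of two counters per interval (IPC and branch misprediction rate) are all hypothetical.

```python
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means. Returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = list(rng.sample(points, k))  # pick k distinct starting points
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: attach each interval to its nearest centroid.
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, labels


# Hypothetical per-interval PMC vectors: (IPC, branch misprediction rate).
# The first three intervals behave alike, as do the last three.
intervals = [
    (1.8, 0.02), (1.9, 0.03), (1.7, 0.02),
    (0.6, 0.20), (0.5, 0.22), (0.7, 0.19),
]

centroids, labels = kmeans(intervals, k=2)
print(labels)  # intervals sharing a label are treated as the same phase
```

Intervals that receive the same label would be managed as one phase. A reliability-oriented study such as the one described would then check how well the per-phase grouping tracks measured AVF, which this sketch does not attempt.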


Advanced Parallel Programming Technologies | 2009

Implementation of Rotation Invariant Multi-View Face Detection on FPGA

Jinbo Xu; Yong Dou; Yuxing Tang; Xiaodong Wang

This paper aims to detect faces under ±90-degree rotation-out-of-plane and 360-degree rotation-in-plane pose changes quickly and accurately in an embedded hardware environment. We present a fine-classified method and a hardware architecture for rotation invariant multi-view face detection. A tree-structured detector hierarchy is designed to organize multiple detector nodes, each identifying a pose range of faces. We propose a boosting algorithm for training the detector nodes. The strong classifier in each detector node is composed of multiple newly designed two-stage weak classifiers. Each detector node handles multi-dimensional binary classification problems by means of a shared output space of multi-component vectors. The characteristics of the proposed method are analyzed to fully exploit spatial and temporal parallelism. We present the design of the hardware architecture in detail. Experiments on FPGA show that high accuracy and high speed are achieved compared with previous related work. The execution time speedups are significant when our FPGA design is compared with a software solution on a PC.


Advanced Parallel Programming Technologies | 2009

Performance Optimization Strategies of High Performance Computing on GPU

Anguo Ma; Jing Cai; Yu Cheng; Xiaoqiang Ni; Yuxing Tang; Zuocheng Xing

GPUs have recently become widely used in scientific computing and engineering applications, owing primarily to the evolution of GPU architecture. First, we analyze some key performance characteristics of the GPU in detail, along with the relationships among GPU architecture, programming model and memory hierarchy. Second, we present three performance optimization strategies: Prefetching, Streamlizing, and Task Division. We conduct experiments to characterize the relationships among different factors and efficiency. Finally, we map the HPL benchmark to the GPU to validate our strategies and achieve a speedup.


Advanced Parallel Programming Technologies | 2005

RIMP: runtime implicit predication

Yuxing Tang; Kun Deng; Xiaodong Wang; Yong Dou; Xingming Zhou

If-conversion and predicated execution are widely adopted to eliminate the branch misprediction penalty. Previous predicated execution depends on the compiler to generate explicitly predicated instructions. In this paper, a trace-based predication mechanism named RIMP (Runtime IMplicit Predication) is discussed. Candidates for if-conversion are identified during dynamic execution. A conventional trace cache is modified to store RIMP traces, which include instructions from both the fall-through and the target block following the conditional branch. A hardware extension adds predication to the RIMP trace automatically. With the help of RIMP, legacy applications can benefit from the predication mechanism without recompiling source code. Simulation of the RIMP implementation under diverse microarchitecture configurations is presented in the paper. Results show promising performance improvement. In general, RIMP with 64 kB of trace storage delivers an average 10.3% IPC improvement while speeding up execution by over 7%.
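The general idea of if-conversion mentioned in this abstract can be shown in software terms: a conditional branch is replaced by executing both paths and letting a predicate select the result. The sketch below is only an analogy under stated assumptions. RIMP performs this in hardware on trace cache contents, and the function names `branchy` and `predicated` and the arithmetic in each block are hypothetical.

```python
def branchy(cond, a, b):
    """Control-flow form: only one block runs, so hardware must predict the branch."""
    if cond:
        return a + 1  # target block
    return b * 2      # fall-through block


def predicated(cond, a, b):
    """If-converted form: both blocks run, and a predicate selects the result."""
    p = int(bool(cond))   # predicate: 1 if the branch would be taken, else 0
    taken = a + 1         # target block, always computed
    fall_through = b * 2  # fall-through block, always computed
    # Only the result guarded by a true predicate contributes to the output.
    return p * taken + (1 - p) * fall_through


# Both forms agree on every input in this small range.
for cond in (False, True):
    for a in range(-3, 4):
        for b in range(-3, 4):
            assert branchy(cond, a, b) == predicated(cond, a, b)
```

The predicated form does extra work on every call but has no branch to mispredict, which is the trade-off that predication mechanisms aim to exploit.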


Archive | 2011

Vector crossing multithread processing method and vector crossing multithread microprocessor

Xuejun Yang; Weixia Xu; Qiang Dou; Yongwen Wang; Gao Jun; Rangyu Deng; Xiaofei Yi; Yufeng Guo; Yuxing Tang; Zhijun Li; Junjie Wu; Kun Zeng; Xiaobo Yan


Journal of Central South University | 2012

SS-SERA: An improved framework for architectural level soft error reliability analysis

Yu Cheng; Anguo Ma; Yongwen Wang; Yuxing Tang; Minxuan Zhang


Networked Computing and Advanced Information Management | 2011

Accurate vulnerability estimation for cache hierarchy

Yu Cheng; Anguo Ma; Yuxing Tang; Minxuan Zhang


Archive | 2007

Stream data-oriented resequencing access storage buffering method and device

Jiang Jiang; Minxuan Zhang; Zuocheng Xing; Xuejun Yang; Haiyan Chen; Gao Jun; Jinwen Li; Xiaofei Yi; Ming Zhang; Changfu Mu; Liu Yang; Xianjun Zeng; Chiyuan Ma; Yong Li; Xiaoqiang Ni; Yuxing Tang; Chengyi Zhang; Ming Tang


Archive | 2007

64 bit stream processor chip system structure oriented to scientific computing

Xuejun Yang; Minxuan Zhang; Zuocheng Xing; Jiang Jiang; Chiyuan Ma; Yong Li; Haiyan Chen; Gao Jun; Jinwen Li; Xiaofei Yi; Ming Zhang; Chengyi Zhang; Changfu Mu; Liu Yang; Xianjun Zeng; Xiaoqiang Ni; Yuxing Tang


Archive | 2007

Off chip DRAM data sampling method with configurable sample-taking point

Chiyuan Ma; Zhangfu Mu; Ming Zhang; Xuejun Yang; Minxuan Zhang; Zuocheng Xing; Jiang Jiang; Haiyan Chen; Gao Jun; Jinwen Li; Xiaofei Yi; Liu Yang; Xianjun Zeng; Yong Li; Xiaoqiang Ni; Yuxing Tang; Chengyi Zhang

Collaboration

Top Co-Authors

Minxuan Zhang, National University of Defense Technology
Xiaoqiang Ni, National University of Defense Technology
Xuejun Yang, National University of Defense Technology
Chiyuan Ma, National University of Defense Technology
Liu Yang, National University of Defense Technology
Anguo Ma, National University of Defense Technology
Yu Cheng, National University of Defense Technology
Kun Deng, National University of Defense Technology
Xingming Zhou, National University of Defense Technology
Xiaodong Wang, National University of Defense Technology