Peng Ouyang | Researchain

Publication

Featured research published by Peng Ouyang.

Sensors | 2015

Fast Traffic Sign Recognition with a Rotation Invariant Binary Pattern Based Feature

Shouyi Yin; Peng Ouyang; Leibo Liu; Yike Guo; Shaojun Wei

Robust and fast traffic sign recognition is very important but difficult for driving safety assistance systems. This study addresses fast and robust traffic sign recognition to enhance driving safety. The proposed method includes three stages. First, a typical Hough transform is adopted to coarsely locate the candidate regions of traffic signs. Second, a RIBP (Rotation Invariant Binary Pattern) based feature in the affine and Gaussian space is proposed to reduce the time of traffic sign detection and achieve robust traffic sign detection in terms of scale, rotation, and illumination. Third, ANN (Artificial Neural Network) based feature dimension reduction and classification are designed to reduce the traffic sign recognition time. Compared with existing work, the experimental results on public datasets show that this work achieves robust traffic sign recognition with comparable recognition accuracy and faster processing speed, including training speed and recognition speed.
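
The abstract does not spell out how RIBP is computed, so the following is only a sketch of the classic rotation-invariant local binary pattern idea that such features build on, not the authors' descriptor. The function names `lbp_code` and `rotation_invariant` are hypothetical, and the 8-neighbour ring is an assumption.

```python
def lbp_code(neighbors, center):
    """Threshold an 8-pixel ring against the center pixel and pack the bits.

    `neighbors` lists the ring pixels in a fixed circular order (assumed).
    """
    code = 0
    for bit, value in enumerate(neighbors):
        if value >= center:
            code |= 1 << bit
    return code


def rotation_invariant(code, bits=8):
    """Map a code to the minimum over all of its circular bit rotations.

    Rotating the image patch circularly shifts the ring, so every rotated
    version of the same pattern collapses to one canonical value.
    """
    mask = (1 << bits) - 1
    return min(((code >> k) | (code << (bits - k))) & mask for k in range(bits))
```

Rotating the ring of neighbours by any number of positions leaves the canonical value unchanged, which is the property that makes the feature insensitive to in-plane rotation.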

Design Automation Conference | 2015

Efficient memory partitioning for parallel data access in multidimensional arrays

Chenyue Meng; Shouyi Yin; Peng Ouyang; Leibo Liu; Shaojun Wei

Memory bandwidth bottlenecks severely restrict parallel access to data in memory arrays. To increase bandwidth, memory partitioning algorithms have been proposed to access multiple memory banks simultaneously. However, previous partitioning schemes rely on complex partitioning algorithms, which lead to non-optimal memory bank space utilization and unnecessary storage overhead. In this paper, we develop an efficient memory partitioning strategy with low time complexity and low storage overhead for data access in multidimensional arrays. Experimental results show that, compared to the state-of-the-art approach, our memory partitioning algorithm saves up to 93.7% of arithmetic operations, 96.9% of execution time, and 31.1% of storage overhead.
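
To make "accessing multiple memory banks simultaneously" concrete, here is a minimal sketch of a generic cyclic partitioning scheme. It is not the algorithm from the paper, whose details the abstract does not give. The bank mapping `(c0*i + c1*j) mod n_banks`, the helper names, and the 5-point stencil pattern are all assumptions chosen for illustration.

```python
def bank_of(index, coeffs, n_banks):
    """Assumed cyclic mapping: bank = (sum of coeff * coordinate) mod n_banks."""
    return sum(c * i for c, i in zip(coeffs, index)) % n_banks


def conflict_free(pattern, coeffs, n_banks):
    """Check that all offsets of an access pattern land in distinct banks.

    Because the mapping is linear modulo n_banks, shifting every access by
    the same base index shifts every bank by the same amount, so checking
    the offsets alone is enough.
    """
    banks = {bank_of(offset, coeffs, n_banks) for offset in pattern}
    return len(banks) == len(pattern)


# Hypothetical 5-point stencil: a centre element and its four neighbours.
STENCIL = [(0, 0), (0, 1), (0, -1), (1, 0), (-1, 0)]
```

With five banks, the coefficients `(1, 2)` place the five stencil elements in five different banks, so they can be read in one cycle, while `(1, 1)` sends `(0, 1)` and `(1, 0)` to the same bank and forces a serialized access.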

IEEE Transactions on Circuits and Systems II: Express Briefs | 2015

A Fast Integral Image Computing Hardware Architecture With High Power and Area Efficiency

Peng Ouyang; Shouyi Yin; Yuchi Zhang; Leibo Liu; Shaojun Wei

Integral image computing is an important part of many vision applications and is characterized by intensive computation and frequent memory access. This brief proposes an approach for fast integral image computing with high area and power efficiency. For the data flow of the integral image computation, a dual-direction data-oriented integral image computing mechanism is proposed to improve the processing efficiency, and a pipelined parallel architecture is then designed to support this mechanism. The parallelism and time complexity of the approach are analyzed, and the hardware implementation cost of the proposed architecture is also presented. Compared with the state-of-the-art methods, this architecture achieves the highest processing speed with comparatively low logic resources and power consumption.
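
For readers unfamiliar with the underlying operation, the sketch below shows the standard sequential integral image recurrence in software. It does not model the dual-direction mechanism or the pipelined hardware described in the brief. The function names and the zero-padded first row and column are illustrative choices.

```python
def integral_image(img):
    """Return a (h+1) x (w+1) table where ii[y][x] is the sum of img[:y][:x]."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0  # running sum of the current row
        for x in range(w):
            row_sum += img[y][x]
            # Sum above this cell plus the running sum of this row.
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, top, left, bottom, right):
    """Sum of img[top:bottom][left:right] using four table lookups."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]
```

Once the table is built, the sum over any rectangle costs four lookups regardless of its size, which is why the construction step dominates and is the natural target for hardware acceleration.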

Sensors | 2014

A Multi-Modal Face Recognition Method Using Complete Local Derivative Patterns and Depth Maps

Shouyi Yin; Xu Dai; Peng Ouyang; Leibo Liu; Shaojun Wei

In this paper, we propose a multi-modal 2D + 3D face recognition method for a smart city application based on a Wireless Sensor Network (WSN) and various kinds of sensors. Depth maps are exploited for the 3D face representation. For feature extraction, we propose a new feature called Complete Local Derivative Pattern (CLDP), which adopts the idea of layering and has four layers. In the whole system, we apply CLDP separately to Gabor features extracted from a 2D image and depth map, obtaining two features: CLDP-Gabor and CLDP-Depth. The two features, weighted by their corresponding coefficients, are combined at the decision level to compute the total classification distance. Finally, the probe face is assigned the identity with the smallest classification distance. Extensive experiments are conducted on three different databases. The results demonstrate the robustness and superiority of the new approach. The experimental results also show that the proposed multi-modal 2D + 3D method is superior to other multi-modal ones and that CLDP performs better than other Local Binary Pattern (LBP) based features.
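
The decision-level fusion step can be sketched as a weighted sum of per-identity distances followed by an arg-min. This is a minimal illustration under assumptions: the function name, the dictionary layout, and the example weights are hypothetical, and the abstract does not say how the coefficients are chosen or how the distances are computed.

```python
def fuse_and_classify(dist_gabor, dist_depth, w_gabor, w_depth):
    """Combine two per-identity distance tables and pick the closest identity.

    dist_gabor / dist_depth: dict mapping identity -> distance of the probe
    to that identity under the CLDP-Gabor / CLDP-Depth feature (assumed to
    be precomputed elsewhere).
    """
    total = {
        identity: w_gabor * dist_gabor[identity] + w_depth * dist_depth[identity]
        for identity in dist_gabor
    }
    best = min(total, key=total.get)  # smallest total classification distance
    return best, total
```

The weights let the more reliable modality dominate the decision, so the same pair of distance tables can yield different identities under different weightings.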

IEEE Transactions on Very Large Scale Integration Systems | 2017

Deep Convolutional Neural Network Architecture With Reconfigurable Computation Patterns

Fengbin Tu; Shouyi Yin; Peng Ouyang; Shibin Tang; Leibo Liu; Shaojun Wei

Deep convolutional neural networks (DCNNs) have been successfully used in many computer vision tasks. Previous works on DCNN acceleration usually use a fixed computation pattern for diverse DCNN models, leading to an imbalance between power efficiency and performance. We solve this problem by designing a DCNN acceleration architecture called deep neural architecture (DNA), with reconfigurable computation patterns for different models. The computation pattern comprises a data reuse pattern and a convolution mapping method. For massive and different layer sizes, DNA reconfigures its data paths to support a hybrid data reuse pattern, which reduces total energy consumption by 5.9 to 8.4 times over conventional methods. For various convolution parameters, DNA reconfigures its computing resources to support a highly scalable convolution mapping method, which obtains 93% computing resource utilization on modern DCNNs. Finally, a layer-based scheduling framework is proposed to balance DNA’s power efficiency and performance for different DCNNs. DNA is implemented in an area of 16 mm² at 65 nm. On the benchmarks, it achieves 194.4 GOPS at 200 MHz and consumes only 479 mW. The system-level power efficiency is 152.9 GOPS/W (considering DRAM access power), which outperforms the state-of-the-art designs by one to two orders of magnitude.
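
The reported figures can be related with simple arithmetic. The sketch below assumes that the system-level efficiency is the same 194.4 GOPS throughput divided by chip power plus DRAM access power; the abstract does not state this explicitly, so the derived DRAM share is an inference rather than a reported result. The function name is hypothetical.

```python
def implied_power_breakdown(throughput_gops, chip_power_w, system_gops_per_w):
    """Derive chip-only efficiency and the implied DRAM power share.

    Assumption: system efficiency = throughput / (chip power + DRAM power).
    """
    chip_gops_per_w = throughput_gops / chip_power_w
    system_power_w = throughput_gops / system_gops_per_w
    return {
        "chip_gops_per_w": chip_gops_per_w,
        "system_power_w": system_power_w,
        "dram_power_w": system_power_w - chip_power_w,
    }


# Figures taken from the abstract: 194.4 GOPS, 479 mW, 152.9 GOPS/W.
breakdown = implied_power_breakdown(194.4, 0.479, 152.9)
```

Under that assumption the chip alone runs at roughly 406 GOPS/W, and the gap to the system-level figure corresponds to roughly 0.79 W attributed to DRAM access.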

Workshop on Applications of Computer Vision | 2015

A Multi-modal 2D + 3D Face Recognition Method with a Novel Local Feature Descriptor

Xu Dai; Shouyi Yin; Peng Ouyang; Leibo Liu; Shaojun Wei

Research on depth maps is becoming a focus of image understanding and computer vision. In this paper, depth maps are introduced to enhance the performance of face recognition, and a novel multi-modal 2D + 3D method is proposed. First, we propose a new local feature descriptor called Enhanced Local Mixed Derivative Pattern (ELMDP). This feature descriptor is then applied to the 2D intensity image and the depth map respectively. Finally, the two extracted features are multiplied by their corresponding confidence weights and combined. Experiments are conducted on three sub-databases of the Curtin Faces database, which contains variations in illumination, expression, pose, and disguise. Our proposed method outperforms the other methods on recognition rate, and its Receiver Operating Characteristic (ROC) curve is much gentler. All the results demonstrate that the proposed method is both effective and robust.

IEEE Systems Journal | 2017

An AdaBoost-Based Face Detection System Using Parallel Configurable Architecture With Optimized Computation

Shouyi Yin; Peng Ouyang; Xu Dai; Leibo Liu; Shaojun Wei

With the development of image sensor technology, AdaBoost-based face detection is widely used in many monitoring sensor networks and mobile-camera-based applications. Fast face detection with high detection accuracy and low power consumption is very important in such applications. Since AdaBoost-based face detection exhibits the characteristics of dual-direction data computation and data diversity, we propose an AdaBoost-based face detection system using a parallel configurable architecture with optimized computation. The architecture consists of parallel configurable arrays and two-level shared memory systems. It achieves dual-direction-based integral image computation, which improves parallel processing efficiency, and enables subwindow adaptive cascade classification for data diversity, which further improves the detection efficiency in diverse face detection. Compared with state-of-the-art works, this work achieves a maximal detection speed of 30 frames/s at 1080p and extremely low power consumption.

International Symposium on Circuits and Systems | 2015

Neural approximating architecture targeting multiple application domains

Fengbin Tu; Shouyi Yin; Peng Ouyang; Leibo Liu; Shaojun Wei

Approximate computing is emerging as a promising technique for high energy efficiency. Multi-layer perceptron (MLP) models can be used to approximate many modern applications with little quality loss. However, the variety of MLP topologies limits the hardware's performance across all cases. In this paper, a scheduling framework is proposed to guide the mapping of MLPs onto limited hardware resources with high performance. We then design a reconfigurable neural architecture (RNA) to support the proposed scheduling framework. RNA can be reconfigured to accelerate different MLP topologies, and it achieves higher performance than other MLP accelerators.

International Conference on Consumer Electronics | 2015

Efficient lane detection system based on monocular camera

Tao Tan; Shouyi Yin; Peng Ouyang; Leibo Liu; Shaojun Wei

In recent years, lane detection has attracted great interest in the area of intelligent vehicles, as it provides fundamental information for the further development of driving assistance systems. In this paper, we propose a lane detection system based on iterative searching and Random Sample Consensus (RANSAC) curve fitting. Experimental results show the effectiveness of our approach, which detects lanes with a high correct detection rate and can be applied in intelligent vehicles.
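
As a rough illustration of RANSAC curve fitting, the sketch below fits a parabola `x = a*y^2 + b*y + c` to candidate lane points while ignoring outliers. The parabola model, the `(x, y)` point layout, the threshold, and all function names are assumptions; the abstract does not specify the curve model or the iterative searching step, which is not shown here.

```python
import random


def parabola_through(p0, p1, p2):
    """Return (a, b, c) of x = a*y^2 + b*y + c through three (x, y) points."""
    (x0, y0), (x1, y1), (x2, y2) = p0, p1, p2
    d1 = (x1 - x0) / (y1 - y0)  # first divided differences
    d2 = (x2 - x1) / (y2 - y1)
    a = (d2 - d1) / (y2 - y0)   # second divided difference
    b = d1 - a * (y0 + y1)
    c = x0 - a * y0 * y0 - b * y0
    return a, b, c


def ransac_parabola(points, iterations=200, threshold=1.0, seed=0):
    """Keep the 3-point model that explains the most points within threshold."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        sample = rng.sample(points, 3)
        if len({y for _, y in sample}) < 3:
            continue  # repeated y would divide by zero; draw again
        a, b, c = parabola_through(*sample)
        inliers = [(x, y) for x, y in points
                   if abs(a * y * y + b * y + c - x) <= threshold]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = (a, b, c), inliers
    return best_model, best_inliers
```

Fitting `x` as a function of `y` suits lanes that run roughly vertically in the image. A production system would typically refit the final model on all inliers with least squares, which is omitted here for brevity.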

Robotics and Biomimetics | 2014

A fast and robust traffic sign recognition method using ring of RIBP histograms based feature

Shouyi Yin; Peng Ouyang; Leibo Liu; Shaojun Wei

Fast and robust traffic sign recognition is very important but difficult for driving safety assistance systems. This study addresses fast and robust traffic sign recognition to enhance driving safety. We first adopt typical Hough transform methods to coarsely locate the candidate regions (rectangles, triangles, circles, etc.) of the traffic signs. We then propose a ring of RIBP (Rotation Invariant Binary Pattern) histograms based feature in Gaussian space to reduce the traffic sign detection time and achieve robust detection in terms of scale, rotation, and illumination. Finally, learning-based techniques are used to reduce the feature dimension and perform the classification, which greatly reduces the processing time of traffic sign recognition. Experiments on the GTSRB dataset show that this work achieves 98.62% recognition accuracy and an average recognition time of 0.005 seconds per image, giving comparable recognition accuracy and higher recognition speed than state-of-the-art works.
