Chenchen Deng | Researchain

Publication

Featured research published by Chenchen Deng.

Custom Integrated Circuits Conference | 2013

An energy-efficient coarse-grained dynamically reconfigurable fabric for multiple-standard video decoding applications

Leibo Liu; Chenchen Deng; Dong Wang; Min Zhu; Shouyi Yin; Peng Cao; Shaojun Wei

In this paper, we introduce a coarse-grained dynamically reconfigurable fabric, named Reconfigurable Processing Unit (RPU), which is implemented on 5.4×3.1 mm² of silicon in TSMC 65 nm LP1P8M technology. This fabric consists of 16×16 multi-functional Processing Elements (PEs) interconnected by an area-efficient Line-Switched Mesh Connect (LSMC) routing. A Hierarchical Configuration Context (HCC) organization scheme is proposed to reduce the scale of the context memory and enhance configuration efficiency. Two reconfigurable processors are then designed and fabricated to verify the proposed techniques. One processor (called REMUS_HPP) integrates two RPUs, targeting high-performance applications. REMUS_HPP can decode 1920×1080@30fps H.264 streams at 280 mW and 200 MHz, achieving a performance gain of 1.81x and a 14.3x energy efficiency improvement over XPP-III. The other processor (called REMUS_LPP) integrates only one RPU, targeting low-power applications. REMUS_LPP can decode 720×480@35fps H.264 streams at 24.81 mW and 75 MHz, achieving a 76% power reduction and a 3.96x energy efficiency improvement compared with ADRES. More importantly, the RPU is not limited to video decoding applications. It can also process other computation-intensive applications, and the corresponding analysis is given in this paper as well.

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2015

An Efficient Application Mapping Approach for the Co-Optimization of Reliability, Energy, and Performance in Reconfigurable NoC Architectures

Chen Wu; Chenchen Deng; Leibo Liu; Jie Han; Jiqiang Chen; Shouyi Yin; Shaojun Wei

In this paper, an efficient application mapping approach is proposed for the co-optimization of reliability, communication energy, and performance (CoREP) in network-on-chip (NoC)-based reconfigurable architectures. A cost model for the CoREP is developed to evaluate the overall cost of a mapping. In this model, communication energy and latency (as a measure of performance) are first combined into an energy latency product (ELP), and then ELP is co-optimized with reliability through a weight parameter that defines the optimization priority. Both transient and intermittent errors in NoC are modeled in CoREP. Based on CoREP, a mapping approach, referred to as priority and ratio oriented branch and bound (PRBB), is proposed to derive the best mapping by enumerating all the candidate mappings organized in a search tree. Two techniques, branch node priority recognition and partial cost ratio utilization, are adopted to improve the search efficiency. Experimental results show that the proposed approach achieves significant improvements in reliability, energy, and performance. Compared with the state-of-the-art methods in the same scope, the proposed approach has the following distinctive advantages: 1) CoREP is flexible enough to address various NoC topologies and routing algorithms, while others are limited to specific topologies and/or routing algorithms; 2) general quantitative evaluations of reliability, energy, and performance are made separately before being integrated into a unified cost model, while other similar models only touch upon two of them; and 3) CoREP-based PRBB attains a competitive processing speed, which is faster than other mapping approaches.

Custom Integrated Circuits Conference | 2013

SURFEX: A 57fps 1080P resolution 220mW silicon implementation for simplified speeded-up robust feature with 65nm process

Leibo Liu; Weilong Zhang; Chenchen Deng; Shouyi Yin; ShanShan Cai; Shaojun Wei

Speeded-Up Robust Features (SURF) is widely used in computer vision applications. In many recent applications such as mobile devices and vision sensor networks, it is extremely difficult to meet both the performance and power consumption requirements of SURF implementations, especially for CPU, GPU, DSP or FPGA based solutions. In this paper, the SURF algorithm is simplified and optimized for hardware implementation. To increase the throughput, procedures like orientation assignment and descriptor extraction are re-organized while maintaining enough accuracy; the memory accesses have also been improved to increase the bandwidth and reduce repeated data accesses; the workload of each stage in the pipeline is analyzed and balanced to reduce pipeline bubbles. Furthermore, a method called Word Length Reduction (WLR) is adopted to compress the integral image, which reduces the on-chip memory by 40%. The corresponding power consumption is also reduced significantly. The Simplified SURF is implemented on a 3.4×4.0 mm² chip called SURFEX using a TSMC 65 nm process. The chip is able to process 57 frames of 1080p (1920×1080) video per second at a 200 MHz working frequency while dissipating 220 mW. This throughput is 6 times that reported in the latest literature, and the power consumption is less than half that of the best reported implementations.
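The integral image mentioned in the abstract above can be illustrated with a short software sketch. This is not the SURFEX hardware design and it does not implement Word Length Reduction; it only shows the standard integral image and the four-lookup box sum that SURF relies on. The function names `integral_image` and `box_sum` are hypothetical, and the 8-bit pixel depth in the final calculation is an assumption.

```python
def integral_image(img):
    """Return a (h+1) x (w+1) table where ii[y][x] is the sum of img[0:y][0:x]."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0  # running sum of the current row
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, x0, y0, x1, y1):
    """Sum of pixels in the half-open box [x0, x1) x [y0, y1) using four lookups."""
    return ii[y1][x1] - ii[y0][x1] - ii[y1][x0] + ii[y0][x0]


# Small usage example on a 3x3 image.
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
table = integral_image(image)
whole = box_sum(table, 0, 0, 3, 3)         # all nine pixels
lower_right = box_sum(table, 1, 1, 3, 3)   # 5 + 6 + 8 + 9

# Worst-case word length for an uncompressed 1080p integral image,
# assuming 8-bit pixels (an assumption, not stated in the abstract).
max_value = 1920 * 1080 * 255
bits_needed = max_value.bit_length()
```

Under the 8-bit assumption, the largest entry of a 1080p integral image needs 29 bits, which suggests why compressing the integral image has a noticeable effect on on-chip memory.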

IEEE Transactions on Parallel and Distributed Systems | 2017

A Multi-Objective Model Oriented Mapping Approach for NoC-based Computing Systems

Chen Wu; Chenchen Deng; Leibo Liu; Jie Han; Jiqiang Chen; Shouyi Yin; Shaojun Wei

In this paper, a mapping approach oriented to a multi-objective co-optimization model (covering reliability, communication energy, and performance) is proposed to find optimal mappings when applications are mapped onto network-on-chip (NoC) based reconfigurable architectures. A co-optimization model, defined as the reliability efficiency model (REM), is developed to evaluate the overall reliability efficiency of a mapping. In REM, reliability efficiency is defined as the reliability profit at the same energy latency product. Based on REM, a mapping approach, referred to as priority and compensation factor oriented branch and bound (PCBB), is introduced to find the best mapping pattern. Two techniques, priority allocation and compensation factor utilization, are adopted to make a tradeoff between search efficiency and accuracy. Experimental results show that the proposed approach makes three major contributions compared to state-of-the-art approaches. (1) PCBB is highly efficient in finding the best mappings, with a 3x and 720x speedup compared to branch and bound (BB) and simulated annealing (SA), respectively. (2) PCBB is able to dynamically remap after the reconfiguration of the architecture. (3) General quantitative evaluations of reliability, communication energy, and performance are made separately before being integrated into the unified model REM, whereas other similar models only touch upon two of them quantitatively.

IEEE Transactions on Information Forensics and Security | 2016

Against Double Fault Attacks: Injection Effort Model, Space and Time Randomization Based Countermeasures for Reconfigurable Array Architecture

Bo Wang; Leibo Liu; Chenchen Deng; Min Zhu; Shouyi Yin; Shaojun Wei

With the increasing accuracy of fault injections, it has become possible to inject two faults into specific circuit regions precisely at a certain time. Unfortunately, most existing fault attack countermeasures are based on the single fault assumption, and it is, therefore, very difficult to resist double fault attacks. Reconfigurable array architecture (RAA) has the ability to introduce spatial and time randomness by dynamic reconfiguration, which can alleviate the threat of double fault attacks. This paper, for the first time, analyzes the double fault attack issues in the fault injection phase systematically. An evaluation model, named injection effort model (IEM), is proposed to quantify the efforts of a successful fault injection. In IEM, the real injection process is described mathematically using the probability method, so that a theoretical basis can be provided for the corresponding countermeasure design. Based on the concept of spatial and time randomization, three countermeasures are implemented on RAA for the purpose of decreasing the implementation overhead under the premise of ensuring the security. When these countermeasures are adopted, tradeoffs can be made between the double fault resistance and the extra overhead through changing the degree of randomness. Experiments are carried out to analyze the relationship between the resistance and the overhead using Advanced Encryption Standard (AES), Data Encryption Standard (DES), and Camellia. When the overhead constraints in terms of throughput, hardware resources, and energy are 5%, 35%, and 10% respectively, the double fault resistance can increase by two to four orders of magnitude (ranging from 824 to 10 149 for different algorithms).

IEEE Transactions on Very Large Scale Integration Systems | 2015

A Flexible Energy- and Reliability-Aware Application Mapping for NoC-Based Reconfigurable Architectures

Leibo Liu; Chen Wu; Chenchen Deng; Shouyi Yin; Qinghua Wu; Jie Han; Shaojun Wei

This paper proposes a flexible energy- and reliability-aware application mapping approach for network-on-chip (NoC)-based reconfigurable architectures. A parameterized cost model is first developed by combining energy and reliability with a weight parameter that defines the optimization priority. Using this model, the overall mapping cost can be evaluated. Subsequently, a mapping method using branch and bound with a partial cost ratio is employed to find the best mapping by enumerating all the possible patterns organized in a search tree. To improve the search efficiency, nonoptimal mappings are discarded at early stages using the partial cost ratio. Using the proposed approach, applications can be mapped onto most NoC topologies and run with various routing algorithms while considering both energy and reliability. Other state-of-the-art works address the same topic but are limited to a specific topology or routing algorithm. Even for the same topology and routing algorithm, the proposed approach still shows considerable advantages in many aspects. Experiments show that this approach achieves not only a significant reduction in energy but also an improvement in reliability. It also outperforms other approaches in throughput and latency with competitive run time.
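To make the search-tree idea in the abstract above concrete, here is a minimal branch-and-bound sketch for mapping tasks onto a 2D mesh. It is a plain textbook branch and bound, not the paper's partial cost ratio method: the cost is communication volume times Manhattan hop count (an assumed stand-in for communication energy, with no reliability term), and a branch is pruned when its partial cost already reaches the best complete cost. All names (`manhattan`, `mapping_cost`, `branch_and_bound`) and the traffic figures are hypothetical.

```python
def manhattan(a, b, width):
    """Hop count between tiles a and b on a mesh with the given width."""
    return abs(a % width - b % width) + abs(a // width - b // width)


def mapping_cost(mapping, traffic, width):
    """Cost of all traffic edges whose two endpoints are already placed."""
    total = 0
    for (src, dst), volume in traffic.items():
        if src in mapping and dst in mapping:
            total += volume * manhattan(mapping[src], mapping[dst], width)
    return total


def branch_and_bound(tasks, traffic, width, height):
    """Return (best_mapping, best_cost) over all task-to-tile assignments."""
    best = {"cost": float("inf"), "mapping": None}
    tiles = range(width * height)

    def search(index, mapping, used):
        partial = mapping_cost(mapping, traffic, width)
        # Costs are non-negative, so the partial cost is a lower bound on
        # any completion of this branch; prune if it cannot beat the best.
        if partial >= best["cost"]:
            return
        if index == len(tasks):
            best["cost"], best["mapping"] = partial, dict(mapping)
            return
        for tile in tiles:
            if tile not in used:
                mapping[tasks[index]] = tile
                used.add(tile)
                search(index + 1, mapping, used)
                used.remove(tile)
                del mapping[tasks[index]]

    search(0, {}, set())
    return best["mapping"], best["cost"]


# Usage: four tasks in a ring of communication, mapped onto a 2x2 mesh.
tasks = ["a", "b", "c", "d"]
traffic = {("a", "b"): 10, ("b", "c"): 5, ("c", "d"): 10, ("a", "d"): 1}
best_mapping, best_cost = branch_and_bound(tasks, traffic, 2, 2)
```

In this example every communicating pair can be placed on adjacent tiles, so the best cost equals the total traffic volume. A weighted reliability term could be added inside `mapping_cost`, provided the partial cost remains a valid lower bound.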

IEEE Transactions on Information Forensics and Security | 2017

Exploration of Benes Network in Cryptographic Processors: A Random Infection Countermeasure for Block Ciphers Against Fault Attacks

Bo Wang; Leibo Liu; Chenchen Deng; Min Zhu; Shouyi Yin; Zhuoquan Zhou; Shaojun Wei

Traditional detection countermeasures against fault attacks have been criticized as insecure because of the fragile comparison operation that can be maliciously bypassed. In order to avoid the comparison, infection countermeasures have been designed to confuse the faulty ciphertexts so that the output cannot be further explored. This paper presents an infection method that resists fault attacks using the existing Benes network module in high-performance crypto processors. The Benes network is originally used to accelerate permutation operations in block ciphers. The hamming weight of the differential results is balanced by modifying specific network switches, without changing the network topology. A further confusion is performed to destroy the determinacy by configuring part of the network with a random bit-stream. Furthermore, a statistical evaluation method is presented to quantitatively verify the proposed countermeasure in addition to a formal proof of security. This also provides a new concept for the evaluation of future random-enhanced infection methods. Experiments are carried out using Advanced Encryption Standard (AES), triple Data Encryption Standard (DES), and Camellia as examples. Under statistical evaluation, the results show that the proposed countermeasure improves the fault resistance by over four orders of magnitude compared with the unprotected case. Also, the performance and the area overhead are within 10% compared with the original Benes network.

Asia and South Pacific Design Automation Conference | 2015

A novel approach using a minimum cost maximum flow algorithm for fault-tolerant topology reconfiguration in NoC architectures

Leibo Liu; Yu Ren; Chenchen Deng; Shouyi Yin; Shaojun Wei; Jie Han

An approach using a minimum cost maximum flow algorithm is proposed for fault-tolerant topology reconfiguration in a Network-on-Chip system. Topology reconfiguration is converted into a network flow problem by constructing a directed graph with capacity constraints. A cost factor is considered to differentiate between processing elements. This approach maximizes the use of spare cores to repair faulty systems, with minimal impact on area, throughput and delay. It also provides a transparent virtual topology to alleviate the burden for operating systems.
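The conversion of spare-core repair into a network flow problem can be sketched as follows. This is an illustrative model under stated assumptions, not the paper's construction: each faulty core and each spare core is a node, every edge has capacity 1, and the edge cost is the Manhattan distance between the two cores (an assumed cost factor). The class `MinCostFlow` and the function `assign_spares` are hypothetical names; the solver is a standard successive-shortest-path method using Bellman-Ford style relaxation.

```python
from collections import deque


class MinCostFlow:
    def __init__(self, n):
        self.n = n
        # Each edge is [to, capacity, cost, index of reverse edge].
        self.graph = [[] for _ in range(n)]

    def add_edge(self, u, v, cap, cost):
        self.graph[u].append([v, cap, cost, len(self.graph[v])])
        self.graph[v].append([u, 0, -cost, len(self.graph[u]) - 1])

    def solve(self, s, t):
        """Return (max_flow, min_cost) from s to t."""
        flow = cost = 0
        while True:
            dist = [float("inf")] * self.n
            prev = [None] * self.n  # (previous node, edge index)
            queued = [False] * self.n
            dist[s] = 0
            queue = deque([s])
            while queue:  # queue-based Bellman-Ford relaxation
                u = queue.popleft()
                queued[u] = False
                for i, (v, cap, c, _) in enumerate(self.graph[u]):
                    if cap > 0 and dist[u] + c < dist[v]:
                        dist[v] = dist[u] + c
                        prev[v] = (u, i)
                        if not queued[v]:
                            queue.append(v)
                            queued[v] = True
            if prev[t] is None:  # no augmenting path left
                return flow, cost
            push, v = float("inf"), t
            while v != s:  # find the bottleneck capacity
                u, i = prev[v]
                push = min(push, self.graph[u][i][1])
                v = u
            v = t
            while v != s:  # apply the augmentation
                u, i = prev[v]
                edge = self.graph[u][i]
                edge[1] -= push
                self.graph[v][edge[3]][1] += push
                v = u
            flow += push
            cost += push * dist[t]


def assign_spares(faulty, spares):
    """Match faulty cores to spares; positions are (x, y) tuples."""
    nf, ns = len(faulty), len(spares)
    source, sink = nf + ns, nf + ns + 1
    net = MinCostFlow(nf + ns + 2)
    for i, f in enumerate(faulty):
        net.add_edge(source, i, 1, 0)
        for j, sp in enumerate(spares):
            net.add_edge(i, nf + j, 1, abs(f[0] - sp[0]) + abs(f[1] - sp[1]))
    for j in range(ns):
        net.add_edge(nf + j, sink, 1, 0)
    repaired, total = net.solve(source, sink)
    pairs = {}
    for i in range(nf):  # saturated faulty-to-spare edges form the matching
        for v, cap, _, _ in net.graph[i]:
            if nf <= v < nf + ns and cap == 0:
                pairs[faulty[i]] = spares[v - nf]
    return repaired, total, pairs


repaired, total, pairs = assign_spares([(0, 0), (3, 3)], [(4, 0), (4, 3)])
```

Maximizing the flow repairs as many faulty cores as there are usable spares, and minimizing the cost keeps each replacement close to the core it replaces.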

IEEE Transactions on Circuits and Systems II: Express Briefs | 2017

PMCC: Fast and Accurate System-Level Power Modeling for Processors on Heterogeneous SoC

Chenchen Deng; Leibo Liu; Yang Liu; Shouyi Yin; Shaojun Wei

Accurate estimation of power at the system level is essential for system-on-chip (SoC) architects. The integration of heterogeneous processors like CPUs and emerging coarse-grained reconfigurable architectures (CGRAs) in SoCs significantly complicates the power-estimation process. This brief presents an accurate and efficient system-level power modeling framework, power modeling with a customized calibration (PMCC), for processors on heterogeneous SoCs. Quantitative criteria are developed to classify the computing resources of heterogeneous SoCs, including instruction-driven processing architectures and CGRA-based architectures, into two categories automatically. A novel power-modeling technique featuring a genetic algorithm and backpropagation neural network (GA-BPNN) is introduced to address CGRA-like architectures, which cannot be properly handled by the traditional linear regression-based power calibration method. Experimental results show that the power estimation error for CGRAs using GA-BPNN is less than 5%, and the estimation is three orders of magnitude faster than gate-level estimation. Meanwhile, accuracy is improved on most benchmarks compared with the linear model. The average improvement in accuracy is 81% and ranges between 29% and 99%.

Science in China Series F: Information Sciences | 2016

A fast face detection architecture for auto-focus in smart-phones and digital cameras

Peng Ouyang; Shouyi Yin; Chenchen Deng; Leibo Liu; Shaojun Wei

Auto-focus is very important for capturing sharp, face-centered images in digital and smartphone cameras. With the development of image sensor technology, these cameras must process images of increasingly high resolution. Currently it is difficult to support fast auto-focus at low power consumption on high-resolution images. This work proposes an efficient architecture for an AdaBoost-based face-priority auto-focus. The architecture supports block-based integral image computation to improve the processing speed on high-resolution images; meanwhile, it is reconfigurable so that it enables sub-window adaptive cascade classification, which greatly improves the processing speed and reduces power consumption. Experimental results show that a 96% average detection rate and a detection speed of 58 fps (frames per second) are achieved for 1080p (1920×1080) images. Compared with the state-of-the-art work, the detection speed is greatly improved and power consumption is largely reduced.

Key contributions: 1. A parallel, array-based computing architecture is proposed that supports block-based integral image processing on high-resolution images, enabling parallel computation and accelerating the integral computation in face detection. 2. A sub-window adaptive computing mechanism is proposed that achieves a good tradeoff between computational load and detection accuracy. 3. A reconfigurable computing mechanism is proposed: by reconfiguring the interconnection pattern between arrays, the computing mode of the basic computing units within each array, and the function of the basic computing units, it supports sub-window adaptive classification, effectively reducing the amount of computation and improving performance.
