Chunyan Wang | Researchain

Publication

Featured research published by Chunyan Wang.

IEEE Transactions on Circuits and Systems | 2005

FPGA design and implementation of a low-power systolic array-based adaptive Viterbi decoder

Man Guo; M.O. Ahmad; M.N.S. Swamy; Chunyan Wang

In this paper, by modifying the well-known Viterbi algorithm, an adaptive Viterbi algorithm that is based on strongly connected trellis decoding is proposed. Using this algorithm, the design and a field-programmable gate array implementation of a low-power adaptive Viterbi decoder with a constraint length of 9 and a code rate of 1/2 is presented. In this design, a novel systolic array-based architecture with time multiplexing and arithmetic pipelining for implementing the proposed algorithm is used. It is shown that the proposed algorithm can reduce by up to 70% the average number of ACS computations over that by using the nonadaptive Viterbi algorithm, without degradation in the error performance. This results in lowering the switching activities of the logic cells, with a consequent reduction in the dynamic power. Further, it is shown that the total power consumption in the implementation of the proposed algorithm can be reduced by up to 43% compared to that in the implementation of the nonadaptive Viterbi algorithm, with a negligible increase in the hardware.
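As a rough illustration of why pruning the trellis lowers the add-compare-select (ACS) workload, the sketch below runs one trellis step with an optional metric threshold. It is a generic threshold-pruning scheme, not the strongly connected trellis algorithm of the paper; the function name `acs_step`, the 4-state toy trellis, and the branch metric `toy_metric` are all assumptions made for the example.

```python
def acs_step(metrics, branch_metric, n_states, threshold=None):
    """One add-compare-select step over the surviving states.

    metrics: dict mapping state -> accumulated path metric (lower is better).
    Returns (new_metrics, branch_updates), where branch_updates counts the
    add-and-compare operations performed in this step.
    """
    candidates = {}
    branch_updates = 0
    for state, path_metric in metrics.items():
        for bit in (0, 1):
            # Shift-register trellis: the input bit enters at the LSB.
            nxt = ((state << 1) | bit) % n_states
            cand = path_metric + branch_metric(state, bit)
            branch_updates += 1
            # Keep only the best (smallest) metric arriving at each state.
            if nxt not in candidates or cand < candidates[nxt]:
                candidates[nxt] = cand
    if threshold is not None:
        # Discard states whose metric is too far from the current best.
        best = min(candidates.values())
        candidates = {s: m for s, m in candidates.items() if m <= best + threshold}
    return candidates, branch_updates


def toy_metric(state, bit):
    # Hypothetical branch metric, chosen only to make the example deterministic.
    return (state + bit) % 3


start = {s: 0 for s in range(4)}
full, _ = acs_step(start, toy_metric, 4)
pruned, _ = acs_step(start, toy_metric, 4, threshold=0)

# The next step does less work when fewer states survive.
_, full_updates = acs_step(full, toy_metric, 4)
_, pruned_updates = acs_step(pruned, toy_metric, 4)
```

In this toy case the pruned decoder expands three states instead of four in the following step, so it performs fewer branch updates. The savings reported in the abstract come from the authors' own algorithm and a 256-state trellis, which this sketch does not reproduce.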

IEEE Transactions on Circuits and Systems | 2012

A Pipeline VLSI Architecture for Fast Computation of the 2-D Discrete Wavelet Transform

Chengjun Zhang; Chunyan Wang; M.O. Ahmad

In this paper, a scheme for the design of a high-speed pipeline VLSI architecture for the computation of the 2-D discrete wavelet transform (DWT) is proposed. The main focus in the development of the architecture is on providing a high operating frequency and a small number of clock cycles along with an efficient hardware utilization by maximizing the inter-stage and intra-stage computational parallelism for the pipeline. The inter-stage parallelism is enhanced by optimally mapping the computational task of multi decomposition levels to the stages of the pipeline and synchronizing their operations. The intra-stage parallelism is enhanced by dividing the 2-D filtering operation into four subtasks that can be performed independently in parallel and minimizing the delay of the critical path of bit-wise adder networks for performing the filtering operation. To validate the proposed scheme, a circuit is designed, simulated, and implemented in FPGA for the 2-D DWT computation. The results of the implementation show that the circuit is capable of operating with a maximum clock frequency of 134 MHz and processing 1022 frames of size 512 × 512 per second with this operating frequency. It is shown that the performance in terms of the processing speed of the architecture designed based on the proposed scheme is superior to those of the architectures designed using other existing schemes, and it has similar or lower hardware consumption.
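The two reported figures can be cross-checked with simple arithmetic. The sketch below assumes that one frame is a single 512 × 512 image with one sample per pixel; the variable names are illustrative and do not come from the paper.

```python
clock_hz = 134e6            # reported maximum clock frequency
frames_per_second = 1022    # reported frame rate at that frequency
frame_samples = 512 * 512   # assumed: one sample per pixel

# Clock cycles available for each frame, and input samples consumed per cycle.
cycles_per_frame = clock_hz / frames_per_second
samples_per_cycle = frame_samples / cycles_per_frame
```

Under these assumptions the circuit spends about 131,000 clock cycles per frame, which works out to roughly two input samples per clock cycle.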

IEEE Transactions on Circuits and Systems | 2010

A Pipeline VLSI Architecture for High-Speed Computation of the 1-D Discrete Wavelet Transform

Chengjun Zhang; Chunyan Wang; M.O. Ahmad

In this paper, a scheme for the design of a high-speed pipeline VLSI architecture for the computation of the 1-D discrete wavelet transform (DWT) is proposed. The main focus of the scheme is on reducing the number and period of clock cycles for the DWT computation with little or no overhead on the hardware resources by maximizing the inter- and intrastage parallelisms of the pipeline. The interstage parallelism is enhanced by optimally mapping the computational load associated with the various DWT decomposition levels to the stages of the pipeline and by synchronizing their operations. The intrastage parallelism is enhanced by decomposing the filtering operation equally into two subtasks that can be performed independently in parallel and by optimally organizing the bitwise operations for performing each subtask so that the delay of the critical data path from a partial-product bit to a bit of the output sample for the filtering operation is minimized. It is shown that an architecture designed based on the proposed scheme requires a smaller number of clock cycles compared to that of the architectures employing comparable hardware resources. In fact, the requirement on the hardware resources of the architecture designed by using the proposed scheme also gets improved due to a smaller number of registers that need to be employed. Based on the proposed scheme, a specific example of designing an architecture for the DWT computation is considered. In order to assess the feasibility and the efficiency of the proposed scheme, the architecture thus designed is simulated and implemented on a field-programmable gate-array board. It is seen that the simulation and implementation results conform to the stated goals of the proposed scheme, thus making the scheme a viable approach for designing a practical and realizable architecture for real-time DWT computation.
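One way to picture splitting a filtering operation into two independent subtasks is to separate the even-indexed and odd-indexed filter taps. The sketch below is only an assumption about what such a split could look like; it does not reproduce the bitwise organization described in the abstract, and the function names, the indexing convention, and the example filter are hypothetical.

```python
def filter_decimate(x, h):
    """Direct form: y[n] = sum_k h[k] * x[2n + 1 - k], zero outside the signal."""
    out = []
    for n in range(len(x) // 2):
        acc = 0.0
        for k, tap in enumerate(h):
            i = 2 * n + 1 - k
            if 0 <= i < len(x):
                acc += tap * x[i]
        out.append(acc)
    return out


def filter_decimate_split(x, h):
    """Same result, computed as two independent partial sums."""
    h_even = [t if k % 2 == 0 else 0.0 for k, t in enumerate(h)]
    h_odd = [t if k % 2 == 1 else 0.0 for k, t in enumerate(h)]
    part_a = filter_decimate(x, h_even)  # uses only odd-indexed input samples
    part_b = filter_decimate(x, h_odd)   # uses only even-indexed input samples
    return [a + b for a, b in zip(part_a, part_b)]


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
h = [0.5, 0.5, 0.25, -0.25]  # arbitrary example taps, not a real wavelet filter
direct = filter_decimate(x, h)
split = filter_decimate_split(x, h)
```

Because the two partial sums read disjoint sets of input samples, they can be evaluated in parallel and added at the end, which is the general idea behind balancing the load between two computational units.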

IEEE Journal of Solid-State Circuits | 2001

Design and implementation of a switched-current memory cell for low-power and weak-current operations

Chunyan Wang; M.O. Ahmad; M.N.S. Swamy

This paper describes the design and implementation of a new switched-current (SI) memory cell for current-mode signal processing. The SI memory cell operates in a pico-to-nanoampere range. To obtain an acceptable accuracy, a procedure to reduce the negative effects of the nonideal characteristics of MOS transistors in SI circuits is proposed and implemented. A prototype circuit including the new SI memory cell associated with optical sensors has been fabricated with a 0.35-μm n-well technology. The test results show that, in a range of 0.5 pA to 15 nA, the error rate of current memorization/reproduction in the proposed SI memory is below 1% and the power dissipation is in a range of nanowatts or below.

International Symposium on Circuits and Systems | 2005

A VLSI architecture for a high-speed computation of the 1D discrete wavelet transform

Chengjun Zhang; Chunyan Wang; M.O. Ahmad

An efficient VLSI architecture for the computation of the convolution-based discrete wavelet transform (DWT) is presented. The proposed architecture, employing two processing elements and a single buffer in a pipeline mode, enhances the processing time by appropriately decomposing the overall computations and distributing them equally between the two processing elements. The data flow, both within and between the processing elements, is streamlined, making use of the buffer and employing multiple input data paths within the processing elements. The parallelism of operations carried out by the computational blocks in each processing element is made more effective by equalizing the data paths used in these blocks. HSPICE and Verilog simulation results are presented to show that a circuit, whose design is based on the proposed architecture, is, in comparison with other existing architectures, fast and efficient for DWT computation, with a modest decrease in the area.

IEEE Transactions on Image Processing | 2012

An Edge-Adapting Laplacian Kernel For Nonlinear Diffusion Filters

M. R. Hajiaboli; M.O. Ahmad; Chunyan Wang

In this paper, first, a new Laplacian kernel is developed to integrate into it the anisotropic behavior to control the process of forward diffusion in horizontal and vertical directions. It is shown that, although the new kernel reduces the process of edge distortion, it nonetheless produces artifacts in the processed image. After examining the source of this problem, an analytical scheme is devised to obtain a spatially varying kernel that adapts itself to the diffusivity function. The proposed spatially varying Laplacian kernel is then used in various nonlinear diffusion filters starting from the classical Perona-Malik filter to the more recent ones. The effectiveness of the new kernel in terms of quantitative and qualitative measures is demonstrated by applying it to noisy images.
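For context, the classical Perona-Malik filter mentioned above can be sketched as a single explicit update on a 4-neighbour grid. This is the textbook baseline, not the edge-adapting kernel proposed in the paper; the function name `perona_malik_step`, the diffusivity `1 / (1 + (d/k)^2)`, and the parameter values are assumptions chosen for the example.

```python
def perona_malik_step(img, k=10.0, lam=0.2):
    """One explicit 4-neighbour Perona-Malik update with zero-flux borders."""
    rows, cols = len(img), len(img[0])

    def g(d):
        # Diffusivity: close to 1 in flat areas, small across strong edges.
        return 1.0 / (1.0 + (d / k) ** 2)

    out = [row[:] for row in img]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                # Neighbours outside the image contribute no flux.
                if 0 <= rr < rows and 0 <= cc < cols:
                    d = img[rr][cc] - img[r][c]
                    total += g(d) * d
            out[r][c] = img[r][c] + lam * total
    return out


flat = [[5.0] * 4 for _ in range(4)]
edge = [[0.0, 0.0, 100.0, 100.0] for _ in range(3)]
flat_out = perona_malik_step(flat)
edge_out = perona_malik_step(edge)
```

A flat image is left unchanged, while a strong step edge is smoothed only slightly because the diffusivity is small where the local difference is large compared with `k`.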

International Symposium on Circuits and Systems | 2003

A low-power systolic array-based adaptive Viterbi decoder and its FPGA implementation

Man Guo; M. Omair Ahmad; M.N.S. Swamy; Chunyan Wang

In this paper, the design and FPGA implementation of a low-power adaptive Viterbi decoder with a constraint length of 9 and code rate of 1/2 is presented. In this design, a novel systolic array-based architecture with time multiplexing and arithmetic pipelining for implementing the adaptive Viterbi algorithm is used. A scheme for providing a tolerance to clock-to-data skew to avoid timing violation is proposed. A process of eliminating spurious toggles, for reducing power consumption, is also developed. It is shown that the total power consumption in the implementation of the adaptive algorithm can be reduced by up to 43% compared to that in the implementation of a corresponding non-adaptive Viterbi algorithm, with a negligible increase in the hardware.

International Symposium on Circuits and Systems | 2007

An Adaptive Sleep Transistor Biasing Scheme for Low Leakage SRAM

Afshin Nourivand; Chunyan Wang; M. Omair Ahmad

Reducing the leakage power in embedded SRAM memories is critical for low-power applications. Raising the source voltage of SRAM cells in standby mode reduces the leakage currents effectively. However, in order to preserve the state of the cell in standby mode, the source voltage cannot be raised beyond a certain level. The maximum source voltage of an SRAM cell is determined by its hold stability in a particular process corner. Hence, in order to achieve the maximum leakage reduction, the source voltage of each individual cell must be raised up to its maximum safe level. However, any cell-based technique realizing this would not be practically feasible. In this paper, we propose an SRAM leakage reduction technique, referred to as adaptive sleep transistor biasing, which automatically fine-tunes the source voltage of individual memory blocks to their optimum level. Thus, maximum leakage savings can be expected while data is safely retained during standby mode. A preliminary study shows that the proposed scheme has the potential to provide a substantial saving in leakage power over that obtained using conventional techniques.

International Symposium on Circuits and Systems | 2007

CMOS Current-controlled Oscillators

Junhong Zhao; Chunyan Wang

This paper presents the design of current-controlled oscillators (ICOs). Two ICOs are proposed. To reduce the duration of the short-circuit currents caused by slowly changing voltages in the circuits, signal conversion blocks are introduced to generate sharp pulses. This improves the power efficiency of the circuits, which in turn improves their overall performance. Both ICOs can operate over a frequency range from 100 kHz to 900 MHz. The quality of the output waveforms before the buffers is good, and the rise/fall time is consistently short over the entire frequency range. The power dissipation of the ICOs is very low, the same as that of a 5-stage current-starving ring oscillator. Moreover, the scheme of the ICOs allows easy adjustment of the duty cycle of the output pulse signals. A simple digital control structure for the duty cycle is also proposed.

International Symposium on Circuits and Systems | 1999

Design of a transistor-mismatch-insensitive switched-current memory cell

Chunyan Wang; M. Omair Ahmad; M.N.S. Swamy

In this paper, we present a new and efficient switched-current memory cell consisting of six MOS transistors. A charge re-adjusting procedure is implemented in the memory cell to ensure an acceptable processing accuracy. Functionally, the current memory is insensitive to transistor parameter mismatch. The cell is currently under fabrication using 1.5-μm p-well single-poly technology with an area of about 50 × 25 μm². Applications of the proposed memory cell for the design of a multi-purpose analog signal processing operator are also discussed.
