Yiqun Zhu | Researchain

Publication

Featured researches published by Yiqun Zhu.

field-programmable technology | 2009

An architecture of optimised SIFT feature detection for an FPGA implementation of an image matcher

Lifan Yao; Hao Feng; Yiqun Zhu; Zhiguo Jiang; Danpei Zhao; Wenquan Feng

This paper proposes an architecture of optimised SIFT (Scale Invariant Feature Transform) feature detection for an FPGA implementation of an image matcher. For a SIFT-based image matcher to be implemented efficiently on an FPGA, in terms of speed and hardware resource usage, the original SIFT algorithm has been significantly optimised in the following aspects: 1) Upsampling has been replaced with downsampling to save the interpolation operation. 2) Only four scales with two octaves are needed for our image matcher, with moderate degradation of matching performance. 3) The total dimension of the feature descriptor has been reduced from the 128 of the original SIFT to 72, which significantly simplifies the image matching operation. With these optimisations, the proposed FPGA implementation is able to detect the features of a typical image of 640×480 pixels within 31 milliseconds. Compared with the existing SIFT FPGA implementation, which requires 33 milliseconds for an image of 320×240 pixels, the proposed architecture therefore achieves a significant improvement.
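The improvement claimed above can be made concrete by comparing pixel throughput. The short sketch below uses only the two image sizes and timings quoted in the abstract; the helper name `pixels_per_ms` is hypothetical, and treating detection time as proportional to pixel count is an assumption made here for comparison, not a claim from the paper.

```python
# Pixel throughput implied by the two timings quoted in the abstract.
# Assumption: detection time scales with the number of pixels, so
# pixels per millisecond is a fair basis for comparison.

def pixels_per_ms(width, height, time_ms):
    """Return detection throughput in pixels per millisecond."""
    return width * height / time_ms

proposed = pixels_per_ms(640, 480, 31)  # proposed architecture
existing = pixels_per_ms(320, 240, 33)  # existing implementation cited in the abstract
speedup = proposed / existing

print(f"proposed: {proposed:.0f} px/ms")
print(f"existing: {existing:.0f} px/ms")
print(f"speed-up: {speedup:.2f}x")
```

Under that assumption the proposed design processes four times as many pixels in slightly less time, which works out to a little over a fourfold gain in throughput.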

Sensors | 2013

Laser Doppler blood flow imaging using a CMOS imaging sensor with on-chip signal processing

Diwei He; H. C. Nguyen; Barrie Hayes-Gill; Yiqun Zhu; John A. Crowe; Cally Gill; Geraldine F. Clough; Stephen P. Morgan

The first fully integrated 2D CMOS imaging sensor with on-chip signal processing for applications in laser Doppler blood flow (LDBF) imaging has been designed and tested. Achieving a space-efficient design over 64 × 64 pixels means that the standard processing electronics used off-chip cannot be implemented. Therefore the analog signal processing at each pixel is a tailored design for LDBF signals, with balanced optimization for signal-to-noise ratio and silicon area. This custom-made sensor offers key advantages over conventional sensors: the analog signal processing at the pixel level carries out signal normalization; the AC amplification in combination with an anti-aliasing filter allows analog-to-digital conversion with a low number of bits; a low-resource implementation of the digital processor enables on-chip processing; and the data bottleneck that exists between the detector and processing electronics has been overcome. The sensor demonstrates good agreement with simulation at each design stage. The measured optical performance of the sensor is demonstrated using modulated light signals and in vivo blood flow experiments. Images showing blood flow changes with arterial occlusion and an inflammatory response to a histamine skin-prick demonstrate that the sensor array is capable of detecting blood flow signals from tissue.

Optics Letters | 2012

64×64 pixel smart sensor array for laser Doppler blood flow imaging

Diwei He; H. C. Nguyen; Barrie Hayes-Gill; Yiqun Zhu; John A. Crowe; Geraldine F. Clough; C. A. Gill; Stephen P. Morgan

What is believed to be the first fully integrated two-dimensional complementary metal oxide semiconductor (CMOS) imaging array for laser Doppler blood flow imaging is demonstrated. The sensor has 64×64 pixels and includes both analog and digital on-chip processing electronics. This offers several potential advantages over commercial sensors, as the processing is tailored to the signals of interest and the data bottleneck that exists between the sensor and processing electronics is overcome. Achieving a space-efficient design over 64×64 pixels means that the standard processing electronics used off-chip cannot be implemented. Images of both simulated blood flow responses and a blood flow occlusion test demonstrate the capability.

field programmable logic and applications | 2012

An efficient hardware architecture of the optimised SIFT descriptor generation

Wenjuan Deng; Yiqun Zhu; Hao Feng; Zhiguo Jiang

The Scale Invariant Feature Transform (SIFT) algorithm can detect a large number of features in an image, which makes feature descriptor generation a bottleneck for processing speed and hence degrades the overall performance of the algorithm. To tackle this problem, we propose an efficient hardware architecture based on the polar sampled descriptor. It takes only 7.57 µs to generate a feature descriptor of 72 dimensions with a system frequency of 100 MHz, which is equivalent to approximately 132100 feature descriptors per second. It can generate feature descriptors for VGA (640×480 pixels) resolution video at 60 frames per second (fps), provided that there are no more than 2200 features per frame. As far as we know, our hardware architecture has the highest processing speed for descriptor generation among existing architectures.
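The throughput figures in this abstract follow from simple arithmetic on the quoted descriptor time. The sketch below reproduces that arithmetic; the constant names are hypothetical, and the cycle count is derived here from the two quoted numbers rather than stated in the abstract.

```python
# Throughput arithmetic from the figures quoted in the abstract.
DESCRIPTOR_TIME_S = 7.57e-6  # time to generate one 72-dimension descriptor
CLOCK_HZ = 100e6             # system frequency
FRAME_RATE = 60              # VGA video frame rate in frames per second

# Descriptors generated per second if the unit runs back to back.
descriptors_per_second = 1 / DESCRIPTOR_TIME_S

# Feature budget per frame at the target frame rate.
features_per_frame = descriptors_per_second / FRAME_RATE

# Clock cycles spent on each descriptor (derived, not quoted).
cycles_per_descriptor = DESCRIPTOR_TIME_S * CLOCK_HZ

print(f"{descriptors_per_second:.0f} descriptors/s")
print(f"{features_per_frame:.0f} features/frame at {FRAME_RATE} fps")
print(f"{cycles_per_descriptor:.0f} cycles/descriptor")
```

The per-frame budget comes out just above 2200, which matches the limit stated in the abstract.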

application specific systems architectures and processors | 2003

Reconfigurable Viterbi decoding using a new ACS pipelining technique

Yiqun Zhu; Mohammed Benaissa

A novel reconfigurable Viterbi decoder is proposed, based on an area-efficient ACS architecture, in which the constraint length and traceback depth can be reconfigured online to trade off decoding capability and decoding speed. Key techniques of the decoder are 5-level ACS (add-compare-select) pipelining and in-place path metric updating, which result in very high decoding speed and low memory usage. To verify the performance of the decoder, an example design with constraint lengths from 7 to 10 has been successfully implemented on Xilinx Virtex FPGA devices. FPGA implementation results, in terms of decoding speed, resource usage and BER, have been obtained. These confirmed the functionality and the expected higher speeds and lower resources.

EURASIP Journal on Advances in Signal Processing | 2003

A Novel High-Speed Configurable Viterbi Decoder for Broadband Access

Mohammed Benaissa; Yiqun Zhu

A novel design and implementation of an online reconfigurable Viterbi decoder is proposed, based on an area-efficient add-compare-select (ACS) architecture, in which the constraint length and traceback depth can be dynamically reconfigured. A design-space exploration to trade off decoding capability, area, and decoding speed has been performed, from which the maximum level of pipelining against the number of ACS units to be used has been determined while maintaining an in-place path metric updating. An example design with constraint lengths from 7 to 10 and a 5-level ACS pipelining has been successfully implemented on a Xilinx Virtex FPGA device. FPGA implementation results, in terms of decoding speed, resource usage, and BER, have been obtained using a tailored testbench. These confirmed the functionality and the expected higher speeds and lower resources.

Optics Letters | 2015

Multi-exposure laser speckle contrast imaging using a high frame rate CMOS sensor with a field programmable gate array

Shen Sun; Barrie Hayes-Gill; Diwei He; Yiqun Zhu; Stephen P. Morgan

A system has been developed in which multi-exposure laser speckle contrast imaging (LSCI) is implemented using a high frame rate CMOS imaging sensor chip. Processing is performed using a field programmable gate array (FPGA). The system allows different exposure times to be simulated by accumulating a number of short exposures. This has the advantage that the image acquisition time is limited by the maximum exposure time and that regulation of the illuminating light level is not required. This high frame rate camera has also been deployed to implement laser Doppler blood flow processing, enabling a direct comparison of multi-exposure laser speckle imaging and laser Doppler imaging (LDI) to be carried out using the same experimental data. Results from a rotating diffuser indicate that both multi-exposure LSCI and LDI provide a linear response to changes in velocity. This cannot be obtained using single-exposure LSCI, unless an appropriate model is used for correcting the response.
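The idea of simulating longer exposures by accumulating short ones can be sketched in a few lines. The example below is a minimal illustration, not the system's FPGA processing: frames are flat lists of pixel intensities, the contrast is computed over the whole frame rather than over a local window, and the function names and toy data are all hypothetical.

```python
from statistics import mean, pstdev

def synthesize_exposure(frames, n):
    """Simulate a longer exposure by summing the first n short-exposure frames."""
    return [sum(pixels) for pixels in zip(*frames[:n])]

def speckle_contrast(image):
    """Speckle contrast K = standard deviation / mean of the intensities."""
    return pstdev(image) / mean(image)

def multi_exposure_contrast(frames, exposure_counts):
    """Return the contrast for each simulated exposure length (in frames)."""
    return {n: speckle_contrast(synthesize_exposure(frames, n))
            for n in exposure_counts}

# Toy data: two short exposures of a 4-pixel speckle pattern that
# decorrelates completely between frames.
frames = [
    [1, 3, 1, 3],
    [3, 1, 3, 1],
]
contrast = multi_exposure_contrast(frames, [1, 2])
print(contrast)
```

In this toy case the contrast falls as frames are accumulated, because the moving speckle averages out over the longer simulated exposure. All exposure lengths are derived from the same captured frames, so the acquisition time is set by the longest one.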

IEEE Transactions on Circuits and Systems | 2007

Reconfigurable Hardware Architectures for Sequential and Hybrid Decoding

Mohammed Benaissa; Yiqun Zhu

A novel reconfigurable sequential decoder architecture based on the Fano algorithm is presented in which the constraint length, the threshold spacing, and the time-out threshold are all run-time reconfigurable. To maximize decoding performance, the maximum possible backward depth (a whole frame) is supported. This is achieved by using shift registers combined with memory to store the information of an entire visited path. A field-programmable gate array (FPGA) prototype of the decoder is built, actual hardware decoding performance in terms of decoding speed, bit error rate (BER), and buffer overflow rate is obtained, and comparisons are made. To overcome the decoding delay that is inherent in sequential decoders, a hybrid scheme including simple block codes and a cyclic redundancy check is proposed to limit the number of backward search operations that the sequential decoder has to execute. As a result, a significant reduction in decoding delay and buffer overflow rate is achieved while maintaining comparable decoding performance in terms of BER.

Medical Engineering & Physics | 2011

Low resource processing algorithms for laser Doppler blood flow imaging

H. C. Nguyen; Barrie Hayes-Gill; Yiqun Zhu; John A. Crowe; Diwei He; Stephen P. Morgan

The emergence of full field laser Doppler blood flow imaging systems based on CMOS camera technology means that a large amount of data from each pixel in the image needs to be processed rapidly and system resources need to be used efficiently. Conventional processing algorithms that are utilized in single point or scanning systems are therefore not an ideal solution as they will consume too much system resource. Two processing algorithms that address this problem are described and efficiently implemented in a field programmable gate array. The algorithms are simple enough to use low system resource but effective enough to produce accurate flow measurements. This enables the processing unit to be integrated entirely in an embedded system, such as in an application-specific integrated circuit. The first algorithm uses a short Fourier transformation length (typically 8) but averages the output multiple times (typically 128). The second method utilizes an infinite impulse response filter with a low number of filter coefficients that operates in the time domain and has a frequency-weighted response. The algorithms compare favorably with the reference standard 1024 point fast Fourier transform in terms of both resource usage and accuracy. The number of data words per pixel that need to be stored for the algorithms is 1024 for the reference standard, 8 for the short length Fourier transform algorithm and 5 for the algorithm based on the infinite impulse response filter. Compared to the reference standard the error in the flow calculation is 1.3% for the short length Fourier transform algorithm and 0.7% for the algorithm based on the infinite impulse response filter.
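The first algorithm (a short Fourier transform averaged many times) can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the abstract does not give the flow formula, so the frequency-weighted sum of the averaged power spectrum used here is an assumed flow measure, normalization is omitted, and the function names are hypothetical.

```python
import cmath
import math

def dft_power(block):
    """Power spectrum of one block for bins 0..N/2 (direct DFT)."""
    n = len(block)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * i / n)
                for i, x in enumerate(block))) ** 2
        for k in range(n // 2 + 1)
    ]

def short_fft_flow(samples, fft_len=8, averages=128):
    """Average `averages` short power spectra, then weight by frequency bin."""
    spectrum = [0.0] * (fft_len // 2 + 1)
    for b in range(averages):
        block = samples[b * fft_len:(b + 1) * fft_len]
        for k, power in enumerate(dft_power(block)):
            spectrum[k] += power / averages
    # Assumed flow measure: first moment of the spectrum, skipping the DC bin.
    return sum(k * power for k, power in enumerate(spectrum) if k > 0)

# Toy signals: 1024 samples of a tone in bin 1 and in bin 3 of an 8-point DFT.
slow = [math.cos(2 * math.pi * 1 * i / 8) for i in range(1024)]
fast = [math.cos(2 * math.pi * 3 * i / 8) for i in range(1024)]
print(short_fft_flow(slow), short_fft_flow(fast))
```

Only one 8-sample block is held at a time, which is consistent with the 8 data words per pixel quoted for this algorithm. The higher-frequency toy signal yields the larger estimate, as expected of a frequency-weighted measure.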

international symposium on circuits and systems | 2003

A novel ACS scheme for area-efficient Viterbi decoders

Yiqun Zhu; Mohammed Benaissa

This paper presents a novel ACS (add-compare-select) scheme that enables high speeds to be achieved in area-efficient Viterbi decoders without compromising area and power efficiency. This is achieved by introducing multi-level pipelining into the ACS feedback loop. As a proof of concept, a constraint-length-7 Viterbi decoder using 8 ACS units has been designed with 5 pipeline levels. This design has been implemented successfully on an FPGA device. The results obtained confirm functionality, speed improvements and the expected low resource usage. To quantify these, a state-parallel Viterbi decoder design has also been implemented on the same FPGA device and performance comparisons made.
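For readers unfamiliar with the operation being pipelined, the sketch below shows one plain add-compare-select update over a toy 4-state trellis. It illustrates only the basic recursion whose feedback loop the paper pipelines; it does not reproduce the multi-level pipelining scheme itself. The trellis connectivity, the metric values and the name `acs_step` are hypothetical.

```python
def acs_step(path_metrics, branch_metrics, predecessors):
    """One add-compare-select update over all trellis states.

    path_metrics[s]   : accumulated metric of the survivor ending in state s
    branch_metrics[s] : metrics of the two branches entering state s
    predecessors[s]   : the two states that feed state s
    Returns the new path metrics and one decision bit per state.
    """
    new_metrics, decisions = [], []
    for state, (p0, p1) in enumerate(predecessors):
        # Add: extend both candidate paths into this state.
        cand0 = path_metrics[p0] + branch_metrics[state][0]
        cand1 = path_metrics[p1] + branch_metrics[state][1]
        # Compare and select: keep the smaller metric, record which path won.
        if cand0 <= cand1:
            new_metrics.append(cand0)
            decisions.append(0)
        else:
            new_metrics.append(cand1)
            decisions.append(1)
    return new_metrics, decisions

# Hypothetical 4-state trellis and metrics, for illustration only.
predecessors = [(0, 1), (2, 3), (0, 1), (2, 3)]
path_metrics = [4, 1, 2, 5]
branch_metrics = [(0, 2), (1, 1), (2, 0), (1, 1)]

new_metrics, decisions = acs_step(path_metrics, branch_metrics, predecessors)
print(new_metrics, decisions)
```

Each new path metric depends on metrics from the previous step, which is the feedback loop that limits clock speed. A constraint-length-7 code has 64 trellis states, so a design with 8 ACS units has to share each unit across several states rather than processing all states in parallel.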