Jie Lei
Xidian University
Publication
Featured research published by Jie Lei.
IET Computers & Digital Techniques | 2014
Changhe Song; Yunsong Li; Jie Guo; Jie Lei
This study explores the use of graphics processing units (GPUs) for performing the two-dimensional discrete wavelet transform (DWT) of images. The study of fast wavelet transforms has been driven both by the enormous volumes of data produced by modern cameras and by the need for real-time processing of these data. With the emergence of general-purpose computing on GPUs, many time-consuming applications have started to reap the associated benefits. Published GPU-based DWT implementations follow one of two approaches: the row-column (RC) approach and the block-based (BB) approach. Most state-of-the-art techniques are based on the RC approach, which exploits the parallelism between different rows and columns; few works are based on the BB approach, which exploits the parallelism between different blocks of the image. Although the RC approach is easy to implement, its resource usage usually depends on the image size. According to the authors' analysis, another shortcoming of the RC approach is that it requires more global memory accesses. The authors thus select the BB approach in this study. Experimental results show that the proposed BB approach outperforms the RC approach, being 99× faster than a native CPU implementation for 4096 × 4096 images.
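The difference between the two approaches can be illustrated with a minimal CPU sketch. It is not the authors' GPU code: the abstract does not name the wavelet filter, so the Haar filter is assumed here because its two-sample support lets every block be transformed without neighbouring data. The function names `haar_1d`, `dwt2_row_column` and `dwt2_block_based` are hypothetical.

```python
def haar_1d(v):
    """One-level 1D Haar transform (assumed filter): averages first, then differences."""
    half = len(v) // 2
    low = [(v[2 * i] + v[2 * i + 1]) / 2 for i in range(half)]
    high = [(v[2 * i] - v[2 * i + 1]) / 2 for i in range(half)]
    return low + high


def dwt2_row_column(img):
    """Row-column approach: transform every row, then every column."""
    rows = [haar_1d(r) for r in img]
    cols = [haar_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def dwt2_block_based(img, block=2):
    """Block-based approach: transform each block independently, then scatter
    its four subbands into the global subband layout.
    Assumes `block` is even and divides both image dimensions."""
    h, w = len(img), len(img[0])
    hb = block // 2
    out = [[0.0] * w for _ in range(h)]
    for by in range(0, h, block):
        for bx in range(0, w, block):
            tile = [row[bx:bx + block] for row in img[by:by + block]]
            t = dwt2_row_column(tile)
            for y in range(block):
                # Low-pass rows go to the top half, high-pass rows to the bottom half.
                oy = by // 2 + y if y < hb else h // 2 + by // 2 + (y - hb)
                for x in range(block):
                    ox = bx // 2 + x if x < hb else w // 2 + bx // 2 + (x - hb)
                    out[oy][ox] = t[y][x]
    return out
```

With the Haar filter both functions return the same coefficients, so the choice only changes how the work is partitioned. Longer filters would need overlapping block borders, which this sketch does not handle.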
Proceedings of SPIE | 2008
Jie Lei; Yunsong Li; Fanqiang Kong; Chengke Wu
An innovative VLSI architecture for the JPEG-LS compression algorithm is proposed, which implements real-time image compression in either near-lossless or lossless mode. The proposed architecture mainly comprises four parallel pipelines, in which four pixels from four consecutive lines can be processed simultaneously with a specific coding scan sequence, ensuring low complexity and real-time data processing. The VLSI architecture is implemented on a Xilinx XC2VP30 FPGA. The experimental results show that the hardware system achieves the same image quality and compression rate as the standard JPEG-LS method, and that its processing speed is four times that of the traditional method.
International Congress on Image and Signal Processing | 2013
Jie Guo; Yunsong Li; Kai Liu; Jie Lei; Chengke Wu
In this paper, an efficient VLSI architecture for a JPEG2000 encoder is presented. The proposed architecture functionally consists of three main parts: the discrete wavelet transform (DWT), the block encoder (i.e., embedded block coding with optimized truncation (EBCOT), which combines a bitplane coder, an MQ coder and rate-distortion (RD) truncation) and the memory management unit (MMU). For the DWT, a high-performance line-based lifting implementation supporting both the 5/3 reversible and 9/7 irreversible filters is used to gain higher computational accuracy under lower hardware overhead constraints. For the block encoder, a bitplane-parallel EBCOT architecture and an efficient MQ coder scheme are adopted to increase parallelism and hardware utilization. A hardware-oriented RD truncation is proposed to reduce processing time. The MMU switches between on-chip and off-chip memories according to image size in order to reduce power consumption. Experimental results demonstrate that the proposed architecture attains a throughput of 120 Msamples per second.
Signal Processing Systems | 2012
Kai Liu; Jie Lei; Yunsong Li
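The 5/3 reversible filter mentioned above is commonly computed with two integer lifting steps. The following is a minimal software sketch of that one-dimensional lifting scheme with symmetric boundary extension, not the paper's line-based hardware design; the function names and the restriction to even-length inputs are assumptions made for brevity.

```python
def fwd53(x):
    """Forward reversible 5/3 lifting on an even-length list of integers.
    Returns (s, d): low-pass and high-pass coefficients."""
    n = len(x)
    half = n // 2
    # Predict step: odd samples minus the floor-average of their even neighbours.
    # At the right edge the missing sample x[n] is mirrored to x[n - 2].
    d = [
        x[2 * i + 1] - ((x[2 * i] + x[2 * i + 2 if 2 * i + 2 < n else n - 2]) >> 1)
        for i in range(half)
    ]
    # Update step: even samples plus a rounded quarter of neighbouring details.
    # At the left edge the mirrored detail d[-1] equals d[0].
    s = [x[2 * i] + ((d[max(i - 1, 0)] + d[i] + 2) >> 2) for i in range(half)]
    return s, d


def inv53(s, d):
    """Inverse of fwd53: undo the update step, then the predict step."""
    half = len(s)
    even = [s[i] - ((d[max(i - 1, 0)] + d[i] + 2) >> 2) for i in range(half)]
    odd = [d[i] + ((even[i] + even[min(i + 1, half - 1)]) >> 1) for i in range(half)]
    x = []
    for e, o in zip(even, odd):
        x.extend([e, o])
    return x
```

Because each step adds or subtracts a value computed only from samples that the inverse also has, `inv53(*fwd53(x))` returns `x` exactly, which is what makes the filter usable for lossless coding.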
A bit-plane parallel architecture is proposed for a modified set partitioning in hierarchical trees (SPIHT) algorithm without lists, which uses a breadth-first search scheme. An analysis of the SPIHT algorithm shows that the breadth-first search scheme is suitable for very large scale integration (VLSI) implementation. The architecture offers high parallelism and needs no intermediate buffer, because a single tree is scanned. After field programmable gate array (FPGA) synthesis and simulation, the throughput of the proposed architecture reaches 60 Msamples/s. As the breadth-first search scheme is very similar to that of SPIHT with lists, the quality of reconstructed images is almost the same as that of SPIHT with lists.
Data Compression, Communications and Processing | 2012
Wen Wei; Jie Lei; Yunsong Li
A novel FPGA-based hardware implementation of a JPEG-LS encoder is introduced in this paper. Using a look-ahead technique, the critical delay paths of the LOCO-I algorithm, such as the feedback loop of the parameter update circuit, are shortened. An optimized architecture for the JPEG-LS encoder is then proposed, which also covers the run-mode encoding process of JPEG-LS. Experimental results show that the circuit complexity and memory consumption of the proposed structure are much lower, and its data processing speed much higher, than those of some other available structures. It is therefore well suited to high-speed onboard lossless compression of satellite sensing images.
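For readers unfamiliar with LOCO-I, its fixed prediction step is the median edge detector (MED), sketched below. This only illustrates the prediction stage; the context parameter update and the look-ahead technique that the abstract optimises are not shown, and the function name is an assumption.

```python
def med_predict(a, b, c):
    """Median edge detector used by LOCO-I / JPEG-LS.
    a: left neighbour, b: upper neighbour, c: upper-left neighbour."""
    if c >= max(a, b):
        return min(a, b)  # an edge is likely; pick the smaller neighbour
    if c <= min(a, b):
        return max(a, b)  # an edge is likely; pick the larger neighbour
    return a + b - c      # smooth region: planar prediction


# The encoder codes the residual (actual pixel minus prediction).
residual = 14 - med_predict(10, 20, 15)
```

The predictor itself has no state. The feedback loop arises afterwards, when each residual updates per-context statistics that the next pixel in the same context must read.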
Journal of Circuits, Systems, and Computers | 2018
Jie Guo; Yunsong Li; Kai Liu; Jie Lei; Keyan Wang
The pixel purity index (PPI) algorithm is one of the most popular endmember extraction algorithms employed in hyperspectral image unmixing, but it is too time-consuming for real-time analysis in remote sensing applications. A fast field programmable gate array (FPGA) implementation for computing the PPI is proposed in this work. Parallelising over skewers lowers the required I/O bandwidth and on-chip memory capacity, and the Xilinx Vivado high-level synthesis (HLS) tool speeds up the architecture design and implementation. The overall design is simple to implement and makes FPGA hardware appealing for on-board hyperspectral unmixing.
Data Compression, Communications and Processing | 2012
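As a rough software illustration of what is being accelerated, the PPI projects every pixel onto a set of random directions (skewers) and counts how often each pixel is an extreme of a projection. The sketch below assumes Gaussian random skewers and a fixed seed; the function name and parameters are hypothetical and unrelated to the paper's HLS code.

```python
import random


def ppi_counts(pixels, num_skewers, seed=0):
    """Return, for each pixel, how many times it was the minimum or maximum
    projection onto a random skewer. `pixels` is a list of spectra."""
    rng = random.Random(seed)
    bands = len(pixels[0])
    counts = [0] * len(pixels)
    for _ in range(num_skewers):
        # Random direction; normalisation is skipped because scaling a
        # skewer does not change which pixels are extreme.
        skewer = [rng.gauss(0.0, 1.0) for _ in range(bands)]
        proj = [sum(p * s for p, s in zip(px, skewer)) for px in pixels]
        counts[proj.index(max(proj))] += 1
        counts[proj.index(min(proj))] += 1
    return counts


# Three pure spectra followed by two mixtures of them.
pixels = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0.5, 0.3, 0.2], [0.2, 0.2, 0.6]]
counts = ppi_counts(pixels, num_skewers=200)
```

Each loop iteration depends only on its own skewer, so iterations can run in parallel and their counts can be summed at the end. Pixels with high counts are the endmember candidates.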
Kai Liu; Jin Zhang; Evgeny Belyaev; Yunsong Li; Jie Lei
We propose a zero block detection algorithm and architecture for EBCOT. A detailed analysis of the precision and distribution of wavelet coefficients in JPEG2000 reveals three main modes of zero coefficients in the wavelet domain: zero column, zero stripe and zero code block. We also find that the coding information of each bit plane and the corresponding passes can be obtained simultaneously in the hardware structure. Therefore, bit-plane-parallel and pass-parallel coding with zero detection is proposed, and its VLSI architecture is presented in detail. The analysis and the corresponding software and hardware experimental results show that the proposed architecture greatly reduces the processing time compared with other architectures.
Data Compression, Communications and Processing | 2012
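The three zero modes can be pictured with a small software sketch. It assumes the usual EBCOT scan geometry, in which a bit plane is processed in stripes of four rows and each stripe column holds four bits; the function name, the input layout and the returned dictionary are assumptions for illustration, not the paper's architecture.

```python
STRIPE_HEIGHT = 4  # EBCOT scans each bit plane in stripes of four rows


def classify_zero_regions(plane):
    """Classify one bit plane of a code block (a list of rows of 0/1 bits).
    Reports all-zero stripes, all-zero columns inside the remaining stripes,
    and whether the whole plane is zero."""
    stripes = [plane[y:y + STRIPE_HEIGHT] for y in range(0, len(plane), STRIPE_HEIGHT)]
    zero_stripes = []
    zero_columns = []
    for s, stripe in enumerate(stripes):
        column_is_zero = [not any(col) for col in zip(*stripe)]
        if all(column_is_zero):
            zero_stripes.append(s)
        else:
            zero_columns.extend((s, x) for x, z in enumerate(column_is_zero) if z)
    return {
        "zero_code_block": len(zero_stripes) == len(stripes),
        "zero_stripes": zero_stripes,
        "zero_columns": zero_columns,
    }
```

A coder could use such flags to skip the scan of any region reported as zero.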
Jie Guo; Yunsong Li; Kai Liu; Jie Lei; Chengke Wu
This paper describes a single event upset (SEU) fault injection framework. Based on assumptions about SEU effects and SEU distribution, the quantitative relationship between measured data and the simulation model is investigated. By adjusting some parameters of the simulation-based framework, its results can be brought close to published data and to some accelerated radiation experiments. Furthermore, the sensitivity of a JPEG2000-based hardware architecture to SEUs can be determined. Taking hardware resources and operating frequencies into account, fault-tolerant techniques can then be introduced into the more sensitive parts, which shows the framework's effectiveness in fault-tolerant design for image compression applications.
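At its simplest, simulation-based SEU injection flips one stored bit and observes the effect downstream. The sketch below assumes a uniform distribution over all bits of a memory image, which is a simplification and not necessarily the SEU distribution used in the paper; `inject_seu` is a hypothetical name.

```python
import random


def inject_seu(memory, rng):
    """Return a copy of `memory` (bytes-like) with exactly one bit flipped,
    together with the index of the flipped bit."""
    corrupted = bytearray(memory)
    bit = rng.randrange(len(corrupted) * 8)  # uniform choice: an assumption
    corrupted[bit // 8] ^= 1 << (bit % 8)
    return corrupted, bit


original = bytes(range(16))
corrupted, bit = inject_seu(original, random.Random(1))
# Number of differing bits between the two memory images.
flipped = sum(bin(a ^ b).count("1") for a, b in zip(original, corrupted))
```

Repeating the injection many times and comparing the decoder output against a fault-free run gives an estimate of how sensitive each memory region is.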
Archive | 2012
Yunsong Li; Juan Song; Chengke Wu; Kai Liu; Jie Lei; Keyan Wang
Space missions are designed to leave Earth’s atmosphere and operate in outer space. Satellite imaging payloads operate mostly with a store-and-forward mechanism, in which captured images are stored on board and transmitted to ground later on. With the increase of spatial resolution, space missions are faced with the necessity of handling an extensive amount of imaging data. The increased volume of image data exerts great pressure on limited bandwidth and onboard storage. Image compression techniques provide a solution to the “bandwidth vs. data volume” dilemma of modern spacecraft. Therefore, compression is becoming a very important feature in the payload image processing units of many satellites [1].
International Conference on Wireless Communications, Networking and Mobile Computing | 2010
Kai Liu; Jie Lei; Yunsong Li
In this paper, we present a bit-plane parallel architecture for a modified SPIHT algorithm without lists, using breadth-first search of the coefficient trees. This scheme ensures that the quality of reconstructed images is almost the same as that of SPIHT with lists. Compared with other architectures, ours offers high parallelism, needs no intermediate buffer and provides error resilience, because the coefficient trees are visited independently.