Hongxu Jiang
Beihang University
Publication
Featured research published by Hongxu Jiang.
IEEE Geoscience and Remote Sensing Letters | 2014
Yongfei Zhang; Haiheng Cao; Hongxu Jiang; Bo Li
As remote sensing images are often characterized by strong randomness, weak local correlation, and multiple small targets, the commonly used coarse-granularity subband-level quantization scheme fails to make use of these characteristics; thus, the performance improvements of these methods in the literature are often marginal. To address this problem, this letter presents a novel spatially adaptive quantization (SAQ) method for the compression of remote sensing images based on our proposed Visual Distortion Sensitivity (ViDiS) model. The ViDiS model takes into consideration four ViDiS components, namely image luminance, spatial frequency, spatial orientation, and visual masking, to help measure distortion more consistently with the image quality perceived by human beings. Then, an SAQ scheme is proposed to better exploit the content characteristics of remote sensing images, in which quantization is conducted at a finer subband-block level rather than the subband level, under the guidance of the ViDiS model. Experimental results show that the proposed algorithm can preserve better visual quality in low-contrast areas with small targets at a competitive computational cost, which makes it more desirable in compression applications for remote sensing images.
Multimedia Tools and Applications | 2016
Shiming Sun; Hongxu Jiang; Bo Li
As one of the most time-consuming parts of video coding, Motion Estimation (ME) has always been the major issue in the embedded coding system due to its memory-intensive nature. This is even truer now as the gap between processor and memory speed continues to grow in the embedded coding system on multi-core processors. In this paper, a data prefetching algorithm based on a Markov Chain Model (MCMDP) is presented to improve the data access efficiency for the ME of High Efficiency Video Coding (HEVC) on multi-core DSPs. First, by analyzing the process and features of ME, a new method of calculating Motion Vector Predictions (MVPs) is given, in which the coding block’s MVP is estimated from the MVPs of the reference picture instead of the motion vectors of the neighboring blocks. This is critical to improve the efficiency of data prefetching for ME because it eliminates the data dependencies that cause the latency of data prefetching. Second, the experimental results show that the probability distribution of the search windows in ME has continuity and locality in successive pictures, and these statistical properties are consistent with the characteristics of Markov chains. Therefore, a new model based on Markov chains is designed for predicting the prefetch window that covers the search window to improve the coverage of data prefetching. Finally, the experiments on TMS320C6678 demonstrate that the prefetching efficiency is significantly improved for the ME of HEVC on multi-core DSPs.
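The Markov-chain idea in the abstract above can be illustrated with a small sketch. It is not the paper's MCMDP algorithm: the state encoding (quantized search-window offsets), the first-order model, and the names `fit_transitions` and `predict_next` are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

def fit_transitions(states):
    """Count first-order transitions between consecutive search-window states."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(states, states[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(counts, current):
    """Return the most frequently observed successor of `current`.

    Falls back to `current` itself when the state has never been seen,
    which matches the locality assumption (the window tends to stay put).
    """
    if current not in counts:
        return current
    return counts[current].most_common(1)[0][0]

# Hypothetical history of quantized search-window offsets (dx, dy)
# observed for one block position over successive pictures.
history = [(0, 0), (1, 0), (1, 0), (2, 0), (1, 0), (2, 0), (2, 1)]
model = fit_transitions(history)
prediction = predict_next(model, (1, 0))
print(prediction)
```

A prefetcher built on such a model would fetch the region around the predicted state before the actual search window is known; how the real system sizes that region is not stated in the abstract.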
Journal of Applied Remote Sensing | 2016
Yongfei Zhang; Haiheng Cao; Hongxu Jiang; Bo Li
As remote sensing image applications are often characterized by limited bandwidth and high-quality demands, higher coding performance for remote sensing images is desirable. Embedded block coding with optimal truncation (EBCOT) is the fundamental part of the JPEG2000 image compression standard. However, EBCOT only considers correlation within a sub-band and utilizes a context template of eight spatially neighboring coefficients in prediction. The existing optimization methods in the literature that use the current context template provide little performance improvement. To address this problem, this paper presents a new mutual information (MI)-based context template selection and modeling method. By further considering the correlation across sub-bands, the potential prediction coefficients, including neighbors, far neighbors, parent, and parent neighbors, are comprehensively examined and selected so as to achieve a good trade-off between the MI-based correlation criterion and the prediction complexity. Based on the selected context template, a high-order prediction model, which jointly considers the weight and the significance state of each coefficient, is proposed. Experimental results show that the proposed algorithm consistently outperforms the benchmark JPEG2000 standard and state-of-the-art algorithms in terms of coding efficiency at a competitive computational cost, which makes it desirable in real-time compression applications, especially for remote sensing images.
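As a rough illustration of an MI-based selection criterion, the sketch below estimates the empirical mutual information between a target significance sequence and two candidate context sequences. The binary sequences and the function name `mutual_information` are hypothetical; the paper's actual candidate set, estimator, and complexity trade-off are not given in the abstract.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Empirical mutual information (in bits) of two equal-length discrete sequences."""
    n = len(xs)
    px = Counter(xs)            # marginal counts of xs
    py = Counter(ys)            # marginal counts of ys
    pxy = Counter(zip(xs, ys))  # joint counts
    mi = 0.0
    for (x, y), c in pxy.items():
        p_joint = c / n
        # p(x,y) / (p(x) p(y)) written with counts to avoid extra divisions
        mi += p_joint * math.log2(c * n / (px[x] * py[y]))
    return mi

# Hypothetical significance states (1 = significant) of a target coefficient
# and of two candidate context coefficients at the same sample positions.
target      = [0, 1, 0, 1, 1, 0, 0, 1]
candidate_a = [0, 1, 0, 1, 1, 0, 0, 1]  # perfectly correlated with target
candidate_b = [0, 0, 1, 1, 0, 0, 1, 1]  # statistically independent of target

mi_a = mutual_information(target, candidate_a)
mi_b = mutual_information(target, candidate_b)
print(mi_a, mi_b)
```

Under this criterion a candidate with higher MI would be preferred for the context template, subject to whatever complexity budget is imposed.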
International Congress on Image and Signal Processing | 2010
Xiaonan Ji; Hongxu Jiang; Chaosheng Xiao; Yuanpeng Wang
In a video encoding hardware system based on the Camera Link interface, we designed a DSP and FPGA interconnection scheme that uses a software FIFO to coordinate data transmission between different clock domains. This article introduces the relevant interface interconnections and timing analysis. In the system we are using, the method has proved stable and efficient, guaranteeing the speed and correctness of data transmission.
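The abstract does not describe how the software FIFO is built. The sketch below shows one generic form, a fixed-capacity ring buffer with separate read and write indices; the class name `RingFifo` and its interface are assumptions, not the authors' design.

```python
class RingFifo:
    """Fixed-capacity FIFO backed by a circular buffer."""

    def __init__(self, capacity):
        self._buf = [None] * capacity
        self._capacity = capacity
        self._head = 0   # index of the next item to read
        self._tail = 0   # index of the next free slot to write
        self._count = 0  # number of items currently stored

    def push(self, item):
        """Store an item; return False if the FIFO is full (producer must wait)."""
        if self._count == self._capacity:
            return False
        self._buf[self._tail] = item
        self._tail = (self._tail + 1) % self._capacity
        self._count += 1
        return True

    def pop(self):
        """Remove and return the oldest item; raise IndexError if empty."""
        if self._count == 0:
            raise IndexError("pop from empty FIFO")
        item = self._buf[self._head]
        self._head = (self._head + 1) % self._capacity
        self._count -= 1
        return item

# The producer (e.g. the capture side) writes faster than the consumer reads.
fifo = RingFifo(2)
accepted = [fifo.push(word) for word in (0xA1, 0xB2, 0xC3)]
first = fifo.pop()
```

The full flag returned by `push` is what lets a faster writer be throttled by a slower reader; on real hardware the indices would additionally need safe sharing across the two clock domains.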
Journal of Visual Communication and Image Representation | 2016
Yongfei Zhang; Haiheng Cao; Hongxu Jiang; Bo Li
A multi-level DWT architecture with the lowest hardware cost and highest speed is presented. Dual scanning is introduced to improve the row transform and the hardware utilization. RTU/CTU takes advantage of input availability and operates in parallel with small latency. The parallel multi-level architecture leads to the lowest memory and computation cost. It outperforms comparable schemes and is suitable for memory-constrained applications.
Memory requirements and critical path are essential for the 2-D Discrete Wavelet Transform (DWT). In this paper, we address this problem and develop a memory-efficient high-speed architecture for multi-level 2-D DWT. First, a dual data scanning technique is adopted in the 2-D 9/7 DWT processing unit to perform lifting operations, which doubles the throughput per cycle. Second, for the 2-D DWT architecture, the proposed Row Transform Unit (RTU) and Column Transform Unit (CTU) take advantage of input sample availability and provision computing resources accordingly to optimize the processing speed, and the number of processors is further optimized to significantly reduce the hardware cost. Third, to address the high memory cost of the intermediate results from each level and the growth in computation time as the resolution level increases, multiple proposed 2-D DWT units are combined to build a parallel multi-level architecture, which can perform up to six levels of 2-D DWT in a resolution-level-parallel way on arbitrary image sizes at competitive hardware cost. Experimental results demonstrate that the proposed scheme achieves improved hardware performance with significantly reduced on-chip memory resources and computation time, which outperforms state-of-the-art schemes and makes it desirable in memory-constrained real-time application systems.
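For readers unfamiliar with the lifting operations mentioned above, here is a minimal software sketch of one level of the 1-D 9/7 lifting transform and its inverse. It says nothing about the paper's hardware architecture or dual scanning. The lifting constants are the commonly quoted 9/7 values; the scaling convention (`K`), the symmetric boundary handling, and the even-length restriction are assumptions of this sketch.

```python
# Commonly quoted 9/7 lifting constants; K is one common scaling convention.
ALPHA, BETA, GAMMA, DELTA = -1.586134342, -0.05298011854, 0.8829110762, 0.4435068522
K = 1.149604398

def _predict(s, d, c):
    """d[i] += c * (s[i] + s[i+1]), mirroring at the right edge."""
    for i in range(len(d)):
        right = s[i + 1] if i + 1 < len(s) else s[i]
        d[i] += c * (s[i] + right)

def _update(s, d, c):
    """s[i] += c * (d[i-1] + d[i]), mirroring at the left edge."""
    for i in range(len(s)):
        left = d[i - 1] if i > 0 else d[0]
        s[i] += c * (left + d[i])

def forward_97(x):
    """One level of the 1-D 9/7 lifting transform for an even-length signal."""
    assert len(x) % 2 == 0, "sketch handles even lengths only"
    s, d = list(x[0::2]), list(x[1::2])  # even and odd samples
    _predict(s, d, ALPHA)
    _update(s, d, BETA)
    _predict(s, d, GAMMA)
    _update(s, d, DELTA)
    return [v * K for v in s], [v / K for v in d]

def inverse_97(low, high):
    """Undo forward_97 by running the lifting steps in reverse with negated constants."""
    s, d = [v / K for v in low], [v * K for v in high]
    _update(s, d, -DELTA)
    _predict(s, d, -GAMMA)
    _update(s, d, -BETA)
    _predict(s, d, -ALPHA)
    x = [0.0] * (len(s) + len(d))
    x[0::2], x[1::2] = s, d  # interleave even and odd samples
    return x

signal = [3.0, 7.0, 1.0, 8.0, 2.0, 9.0, 4.0, 6.0]
low, high = forward_97(signal)
restored = inverse_97(low, high)
max_error = max(abs(a - b) for a, b in zip(signal, restored))
```

Because each lifting step only adds a function of the other half of the samples, the inverse is obtained by subtracting the same quantity, so reconstruction is exact up to floating-point rounding.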
International Congress on Image and Signal Processing | 2015
Shiming Sun; Hongxu Jiang; Tingshan Liu; Bo Li
As the latest video coding standard, HEVC dramatically reduces the bit rate compared with H.264/AVC. However, its complexity has increased substantially. To accelerate video coding, there has been a trend in recent years toward parallel implementations on multi-core processors, especially in embedded systems. In this paper, a scalable multi-granularity encoder is proposed for embedded systems. First, the parallel granularities in HEVC are analyzed, and the tiles, CTUs (Coding Tree Units), and pixels are mapped to the processors, cores, and computing units, respectively, to fully utilize their advantages. Second, the data exchange and synchronization messages among the parallel granularities are analyzed and efficiently dispatched to the local/shared memory or the high-speed communication interfaces to speed up the encoding process. As a result, video encoding is processed efficiently. Experiments show that the average speedup is about 10.8 on 14 cores and that it scales approximately linearly with the number of processors. Videos with a resolution of 720p or lower can be encoded in real time on a hardware platform composed of two TMS320C6678s.
International Conference on Image Processing | 2014
Haiheng Cao; Yongfei Zhang; Hongxu Jiang
MQ is an efficient entropy coder that performs the actual compression in JPEG2000. However, it usually acts as the bottleneck of the hardware architecture due to the feedback loops caused by iterative operations. The current single ejection (SE) architecture achieves a higher frequency by adopting more pipeline stages, but its speed is limited by the throughput per cycle. On the other hand, the multiple ejection (ME) architecture usually handles more than one context data (CxD) per cycle, but its frequency degrades due to the longer critical path caused by the context dependences. Hence, to enable the MQ arithmetic coder to process more than one sample while running at a higher clock frequency, this paper proposes a two-CxD architecture based on the equality of two adjacent CxDs, in which two adjacent CxDs with different contexts are processed in one clock cycle. Experimental results show that the architecture improves speed by more than 30%.
International Congress on Image and Signal Processing | 2013
Haiheng Cao; Yongfei Zhang; Hongxu Jiang
As the visual distortion sensitivity based spatially adaptive quantization (VDSSAQ) algorithm considers the human visual system (HVS) and tunes the quantizer step sizes in a finer manner to improve perceptual quality, it usually incurs considerable computational complexity and memory access overhead. To address this problem, this paper presents a new and efficient very large scale integration (VLSI) architecture for the implementation of VDSSAQ. The proposed architecture exploits the parallelism between the wavelet transform and quantization, as well as within the quantization algorithm itself, to speed up the computation. In addition, a delayed quantization scheme is designed to work with the bitplane coder (BPC) to further reduce time consumption and memory accesses significantly. Experimental results show that the proposed VLSI architecture outperforms state-of-the-art architectures with the fewest memory accesses and the highest overall throughput, which makes it desirable in real-time image compression applications.
International Congress on Image and Signal Processing | 2010
Feng He; Hongxu Jiang; Chaosheng Xiao; Yuanpeng Wang
In a multi-FPGA-based real-time compression system for remote sensing images, a high-speed circuit design is carried out and a variety of signal integrity problems caused by system impedance discontinuity are analyzed. Building on transmission line impedance control and taking the high-speed DDR2 SDRAM bus as an example, the signal reflection problem caused by impedance mismatch along the signal chain is solved through simulation and analysis with HyperLynx software. In this system, the signal integrity problem caused by impedance mismatch is effectively solved and the reliability of the system is clearly improved, thus achieving the expected goal of the hardware design.
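The link between impedance mismatch and signal reflection can be made concrete with the standard voltage reflection coefficient, (Z_load - Z_0) / (Z_load + Z_0). The sketch below is illustrative only: the function name and the 50 ohm and 75 ohm values are assumptions, not figures from the paper.

```python
def reflection_coefficient(z_load, z_line):
    """Voltage reflection coefficient at a load terminating a transmission line.

    z_load: load (termination) impedance in ohms
    z_line: characteristic impedance of the line in ohms
    """
    return (z_load - z_line) / (z_load + z_line)

# Hypothetical values: a 50 ohm trace with a matched and a mismatched termination.
matched = reflection_coefficient(50.0, 50.0)
mismatched = reflection_coefficient(75.0, 50.0)
print(matched, mismatched)
```

A matched termination reflects nothing, while the mismatched case sends a fraction of the incident voltage back toward the driver, which is the effect that impedance control aims to suppress.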
Archive | 2010
Bo Li; Haiheng Cao; Yi-hong Wen; Hongxu Jiang