Xuli Shi
Shanghai University
Publication
Featured research published by Xuli Shi.
IEEE Transactions on Multimedia | 2008
Liquan Shen; Zhi Liu; Zhaoyang Zhang; Xuli Shi
Variable size motion estimation with multiple reference frames has been adopted by the new video coding standard H.264. It achieves significantly higher coding efficiency than coding a macroblock (MB) in regular size with a single reference frame. On the other hand, it causes high computational complexity of motion estimation at the encoder. Rate distortion optimized (RDO) decision is one powerful method to choose the best coding mode among all combinations of block sizes and reference frames, but it requires extremely high computation. In this paper, a fast inter mode decision is proposed to decide the best prediction mode utilizing the spatial continuity of the motion field, which is generated by motion vectors from 4×4 motion estimation. Motion continuity of each MB is decided based on the motion edge map detected by the Sobel operator. Based on the motion continuity of an MB, only a small number of block sizes are selected in the motion estimation and RDO computation process. Simulation results show that our algorithm can save more than 50% computational complexity, with negligible loss of coding efficiency.
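The idea of running a Sobel operator over a 4×4-block motion field can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `threshold` parameter, the use of |gx| + |gy| as edge strength, and the rule "no motion edge inside the MB means only 16x16 is tried" are all assumptions made for illustration.

```python
# Hypothetical illustration only; names, threshold and mode rule are assumptions.
ALL_MODES = ["16x16", "16x8", "8x16", "8x8", "8x4", "4x8", "4x4"]


def sobel_magnitude(field):
    """Return |gx| + |gy| of the 3x3 Sobel operator; border cells stay 0."""
    h, w = len(field), len(field[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (field[y - 1][x + 1] + 2 * field[y][x + 1] + field[y + 1][x + 1]
                  - field[y - 1][x - 1] - 2 * field[y][x - 1] - field[y + 1][x - 1])
            gy = (field[y + 1][x - 1] + 2 * field[y + 1][x] + field[y + 1][x + 1]
                  - field[y - 1][x - 1] - 2 * field[y - 1][x] - field[y - 1][x + 1])
            out[y][x] = abs(gx) + abs(gy)
    return out


def motion_edge_map(mvx, mvy, threshold):
    """Mark a 4x4 block as a motion edge when the summed Sobel response of
    both motion-vector components exceeds the (assumed) threshold."""
    ex, ey = sobel_magnitude(mvx), sobel_magnitude(mvy)
    return [[ex[y][x] + ey[y][x] > threshold for x in range(len(mvx[0]))]
            for y in range(len(mvx))]


def candidate_modes(edge_map, mb_row, mb_col):
    """An MB covers a 4x4 group of 4x4 blocks. If no block in it lies on a
    motion edge, treat the motion as continuous and try only 16x16."""
    rows = range(4 * mb_row, 4 * mb_row + 4)
    cols = range(4 * mb_col, 4 * mb_col + 4)
    has_edge = any(edge_map[y][x] for y in rows for x in cols)
    return list(ALL_MODES) if has_edge else ["16x16"]
```

With `mvx` and `mvy` given as 2-D lists holding one motion-vector component per 4×4 block, `candidate_modes(motion_edge_map(mvx, mvy, 8), r, c)` returns the reduced mode list for the MB at row `r`, column `c`.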
Journal of Visual Communication and Image Representation | 2009
Liquan Shen; Zhi Liu; Zhaoyang Zhang; Xuli Shi
Since the current rate control schemes in H.264 lack efficient frame-level bit allocation, the video quality varies significantly from frame to frame, especially for sequences with sudden scene changes or high motion activity. To overcome this limitation, we improve the H.264 rate control scheme using two tools: the incremental proportional-integral-differential (PID) algorithm and frame complexity estimation. The incremental PID algorithm is first introduced to control the buffer and reduce the influence of abrupt buffer fluctuation on frame-level bit allocation. To further reduce video quality variations, the frame target bit allocation is also adjusted by the frame complexity, which is estimated from the residual energy. Simulation results show that the proposed rate control scheme, without introducing expensive computational complexity, decreases the average standard deviation of video quality by 32.29%.
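For readers unfamiliar with the incremental (velocity) form of a PID controller, a minimal sketch follows. It uses the textbook update Δu(k) = Kp·[e(k) − e(k−1)] + Ki·e(k) + Kd·[e(k) − 2e(k−1) + e(k−2)]. The class name, the gains, and the interpretation of the error as "target buffer level minus actual buffer level" are assumptions; the abstract does not give the controller's parameters or how its output enters the bit allocation.

```python
class IncrementalPID:
    """Textbook incremental PID: each step returns the change in the control
    output rather than the output itself. Gains are hypothetical."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.e1 = 0.0  # error at step k-1
        self.e2 = 0.0  # error at step k-2

    def step(self, error):
        delta = (self.kp * (error - self.e1)
                 + self.ki * error
                 + self.kd * (error - 2.0 * self.e1 + self.e2))
        # Shift the error history for the next call.
        self.e2, self.e1 = self.e1, error
        return delta
```

In a rate controller of this kind, `error` could be the gap between a target and the measured buffer fullness after each frame, and the returned increment could be accumulated into a correction of the next frame's bit budget; that coupling is an assumption here, not a detail taken from the paper.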
International Conference on Multimedia and Expo | 2007
Liquan Shen; Zhi Liu; Zhaoyang Zhang; Xuli Shi
The H.264 video coding standard adopts multiple reference frames for motion estimation. This new feature improves the prediction accuracy of inter-coded blocks significantly, but it results in a considerable increase in encoder complexity, mainly in multi-frame selection and motion estimation. The reference software JM adopts the full search scheme, and the increased computation is proportional to the number of searched reference frames. However, the reduction of prediction residues depends strongly on the nature of the sequences, not on the number of searched frames. In this paper, we propose an adaptive and fast multi-frame selection algorithm (AFMFS) that uses motion vectors and SAD information from previous searches to adaptively terminate the multiple reference frame selection procedure. Compared with the full search algorithm and the flexible multi-reference frame search criterion (FMRFSC), simulation results show that the proposed algorithm saves 55.28% and 40.23% of the computation cost on average, respectively, while maintaining similar coding efficiency.
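The general pattern of terminating a multi-reference-frame search early can be sketched as below. This is a generic illustration, not AFMFS itself: the abstract does not state the termination rules, so the two stopping conditions (best SAD already below `early_stop_sad`, or the relative SAD gain from the latest frame below `min_gain`) and all names are assumptions, and the motion-vector cue mentioned in the abstract is not modeled.

```python
def select_reference_frame(search_sad, num_refs, early_stop_sad, min_gain):
    """Search reference frames from nearest (index 0) to farthest and stop early.

    search_sad(idx) is a caller-supplied function returning the best SAD found
    by motion estimation on reference frame idx (hypothetical interface).
    Returns (best_frame_index, best_sad, frames_searched).
    """
    best_idx, best_sad = 0, search_sad(0)
    searched = 1
    for idx in range(1, num_refs):
        if best_sad < early_stop_sad:
            break  # match is already good enough
        sad = search_sad(idx)
        searched += 1
        if sad >= best_sad:
            break  # this frame brought no improvement
        gain = (best_sad - sad) / best_sad
        best_idx, best_sad = idx, sad
        if gain < min_gain:
            break  # improvement too small to justify further frames
    return best_idx, best_sad, searched
```

The third element of the result shows how many of the `num_refs` frames were actually searched, which is where the computation saving of any such scheme comes from.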
Optical Engineering | 2007
Xuli Shi; Zhaoyang Zhang; Liquan Shen; Yu Lu; Suxing Liu
This paper proposes a multiresolution method for video object segmentation in the compression domain. We first calculate global motion parameters using only background macroblocks with small residual dc coefficients of the P frame, and then obtain true motion vectors projected to the immediately adjacent I frame. The basic layer image is obtained with only dc coefficients of the I frame. The enhancement texture characteristics are provided by the ac coefficients for partial decoding. The true object motion vectors and the basic layer image are fed into a morphological motion filter to get the lowest-resolution regions of moving objects, called the layer4 region of interest (L4-ROI). Only some of the ac coefficients in L4-ROI are decoded to obtain a higher-resolution image, called layer3, that mainly consists of blocks of the moving object. The moving object of interest in the highest resolution is obtained from a morphological motion filter with L2-ROI and the true object motion vectors. The number of ac coefficients determines the resulting resolution. Experiments show that the new algorithm can extract multiresolution moving objects efficiently.
Signal Processing: Image Communication | 2009
Liquan Shen; Zhi Liu; Zhaoyang Zhang; Xuli Shi
The new video coding standard, H.264, uses variable size motion estimation (VS-ME), multiple reference frame motion estimation (MRF-ME) and spatial-based intra prediction with selectable block size in inter frame coding. These tools achieve significantly higher coding efficiency than coding a macroblock (MB) based only on motion compensation in regular size with a single reference frame. However, these new features also give rise to exhaustive computation in the coding procedure, since so many combinations of coding modes and reference frames have to be tried. In this paper, a fast motion estimation algorithm based on selective VS-MRF-ME and intra prediction is proposed to reduce the computational complexity of H.264 coding. The basic idea of the method is to use the spatiotemporal property of the motion field to predict where VS-MRF-ME and intra prediction are needed, and to enable VS-MRF-ME and intra coding only in these regions. The motion field is generated by motion vectors from 16×16 motion estimation on the nearest reference frame. Simulation results show that the proposed algorithm can save 50% computational complexity on average, with negligible loss of coding efficiency.
International Conference on Audio, Language and Image Processing | 2008
Leirui Wang; Zhaoyang Zhang; Guowei Teng; Liquan Shen; Xuli Shi
Multimedia applications need larger and larger bandwidth. The only way to meet this demand is to provide better and faster video compression standards, and AVS was created in China for this purpose. To address the need for hardware acceleration of its computationally intensive parts, this paper presents high-throughput hardware architectures for fast computation of the 2-D transform, quantization, inverse quantization, and inverse transform. In addition, two high-performance system architectures are presented. The proposed hardware architectures are incorporated into two different hardware systems implemented on a Virtex 4 Pro FPGA. Simulation results show that both system architectures incorporating the proposed designs provide satisfactory performance.
Optical Engineering | 2007
Liquan Shen; Zhi Liu; Zhaoyang Zhang; Xuli Shi
Since the current rate control schemes in H.264 lack the capability of efficient frame-level bit allocation, the video quality varies greatly for sequences with scene cuts or large motions. To overcome that limitation, we propose a rate control scheme based on the incremental proportional-integral-differential (PID) algorithm. That algorithm is introduced to control the buffer and decrease the influence of buffer surface fluctuation on the process of frame-level bit allocation. Extensive simulation results show that this rate control scheme, without adding expensive computational complexity, decreases the average video quality variations by 21.07%.
International Conference on Communication Technology | 2008
Yu Lu; Zhaoyang Zhang; Zhi Liu; Xuli Shi
This paper proposes a novel approach to segmenting objects from H.264 compressed video with a moving background. First, noisy motion vectors are eliminated from the motion field by vector median filtering. Then the predicted motion field reconstructed by backward estimation is used to accumulate the motion field, which is followed by global motion compensation. After that, hypothesis testing is used for initial region classification. Finally, the graph cuts technique is applied to partition objects by minimizing an energy function formulated with a Markov random field model. The experimental results demonstrate the efficient performance and good segmentation quality of the proposed method.
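The first step, vector median filtering of a motion field, can be sketched as follows. The vector median of a set is the member vector with the smallest total distance to all other members, so an isolated outlier vector is replaced by one of its neighbours. The 3×3 window, the Euclidean distance, and the border handling (windows are simply cropped at the field boundary) are assumptions; the abstract does not specify them.

```python
import math


def vector_median(vectors):
    """Return the vector whose summed Euclidean distance to all vectors in
    the set is smallest (ties resolve to the first such vector)."""
    return min(vectors, key=lambda v: sum(math.dist(v, u) for u in vectors))


def vector_median_filter(field):
    """Apply a 3x3 vector median filter to a 2-D grid of (mvx, mvy) tuples.
    Windows are cropped at the borders (an assumed convention)."""
    h, w = len(field), len(field[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [field[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(vector_median(window))
        out.append(row)
    return out
```

Unlike filtering the two components separately, the output at every position is always a vector that actually occurs in the window, which keeps the filtered field physically meaningful.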
Proceedings of SPIE, the International Society for Optical Engineering | 2006
Xuli Shi; Zhi-cheng Jin; Guo-wei Teng; Zhao-yang Zhang; Ping An; Guang Xiao
How to transmit video streams over the Internet and wireless networks is a focus of current research in video standards. One of the key methods is fine granular scalability (FGS), which is supported by MPEG-4 and can adapt to varying network bandwidth at some sacrifice of coding efficiency. Object-based video coding, which can be applied in interactive video, was first included in the MPEG-4 standard. However, real-time segmentation of the video object plane (VOP) is difficult, which limits the application of the MPEG-4 standard in interactive video. H.264/AVC is the latest video coding standard; it enhances compression performance and provides a network-friendly video representation. In this paper, we propose a new object-based FGS (OBFGS) coding algorithm embedded in H.264/AVC that differs from the one in MPEG-4. After algorithm optimization of the H.264 encoder, the FGS coder first finishes the base-layer coding. It then extracts the moving VOP using the base-layer motion vectors and DCT coefficients. The sparse motion vector field of each P-frame, composed of 4×4, 4×8, and 8×4 blocks in the base layer, is interpolated. The DCT coefficients of the I-frame are calculated using the spatial intra-prediction information. After forward projecting each P-frame vector to the immediately adjacent I-frame, the method extracts moving VOPs using a recursive 4×4 block classification process. Only the blocks that belong to the moving VOP, at 4×4 block-level accuracy, are coded to produce the enhancement-layer stream. Experimental results show that the proposed system obtains high quality for the VOP of interest at the cost of lower coding efficiency.
Archive | 2008
Guowei Teng; Xuli Shi; Leirui Wang; Jinyang Liu; Zhaoyang Zhang