Publication


Featured research published by Siwei Ma.


IEEE Transactions on Circuits and Systems for Video Technology | 2005

Rate-distortion analysis for H.264/AVC video coding and its application to rate control

Siwei Ma; Wen Gao; Yan Lu

In this paper, an efficient rate-control scheme for H.264/AVC video encoding is proposed. Because the quantization scheme was redesigned in H.264/AVC, the relationship between the quantization parameter and the true quantization step size is no longer linear. Based on this observation, we propose a new rate-distortion (R-D) model that uses the true quantization step size, and then develop an improved rate-control scheme for the H.264/AVC encoder based on this new R-D model. In general, the current R-D optimization (RDO) mode-selection scheme in the H.264/AVC test model makes rate control difficult, because rate control usually requires a predetermined set of motion vectors and coding modes to select the quantization parameter, whereas RDO works in the opposite order and requires a predetermined quantization parameter to select motion vectors and coding modes. To tackle this problem, we develop a complexity-adjustable rate-control scheme based on the proposed R-D model. Briefly, the proposed scheme is a one-pass process at the frame level and a partial two-pass process at the macroblock level. Since the number of macroblocks that receive two-pass processing can be controlled by an encoder parameter, the fully one-pass implementation is a subset of the proposed algorithm. An additional topic discussed in this paper is video buffering. Since a hypothetical reference decoder (HRD) has been defined in H.264/AVC to guarantee that the buffers never overflow or underflow, more accurate rate-allocation schemes are proposed to satisfy these HRD requirements.
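The nonlinear relationship between the quantization parameter (QP) and the quantization step size mentioned above can be illustrated with a minimal sketch. It relies on the commonly cited H.264/AVC step-size values for QP 0 to 5 and the rule that the step size doubles for every increase of 6 in QP; the function name `qstep` is a hypothetical helper, and the paper's R-D model itself is not reproduced here.

```python
# Commonly cited H.264/AVC quantization step sizes for QP 0..5.
# For larger QP the step size doubles with every increase of 6.
BASE_QSTEP = [0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125]


def qstep(qp: int) -> float:
    """Return the quantization step size for an H.264/AVC QP in [0, 51]."""
    if not 0 <= qp <= 51:
        raise ValueError("QP must be in the range 0..51")
    return BASE_QSTEP[qp % 6] * (2 ** (qp // 6))


if __name__ == "__main__":
    # Equal QP increments give growing step-size increments (exponential growth).
    for qp in (4, 10, 16, 22, 28):
        print(qp, qstep(qp))
```

Because the step size grows exponentially with QP, a model written in terms of the step size behaves quite differently from one written in terms of QP.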


Visual Communications and Image Processing | 2011

Fast mode decision algorithm for intra prediction in HEVC

Liang Zhao; Li Zhang; Siwei Ma; Debin Zhao

High Efficiency Video Coding (HEVC) is a video compression standard under development and a successor to H.264/AVC. It aims to provide significantly improved compression performance compared with all existing video coding standards. In HEVC intra prediction, this is achieved by providing up to 35 intra modes together with larger coding units. The optimal mode is selected first by a rough mode decision (RMD) process over all available modes, and then by the rate-distortion optimization (RDO) process for the final decision. Because coding units of every size are traversed in both procedures, mode selection is very time-consuming. A gradient-based fast mode decision algorithm is therefore proposed in this paper to reduce the computational complexity of HEVC. Prior to intra prediction, gradient directions are calculated and a gradient-mode histogram is generated for each coding unit. Based on the distribution of the histogram, only a small subset of the candidate modes is chosen for the RMD and RDO processes. Compared with the default encoding scheme in the HEVC test model HM 4.0, experimental results show that the fast intra mode decision scheme provides almost 20% time savings on average in all-intra low-complexity cases, with negligible loss of coding efficiency.
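The gradient-mode histogram idea can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: it uses Sobel gradients, accumulates gradient amplitude into uniformly spaced angular bins (a hypothetical stand-in for the actual HEVC angular modes), and keeps the strongest bins as candidates. The names `gradient_histogram` and `top_candidates` are invented for this sketch.

```python
import math


def gradient_histogram(block, num_bins=33):
    """Accumulate gradient amplitude per edge-direction bin for one block.

    `block` is a list of rows of luma samples. Bins are uniform over
    [0, pi); mapping them to real HEVC modes is outside this sketch.
    """
    height, width = len(block), len(block[0])
    hist = [0.0] * num_bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Sobel operators for horizontal (gx) and vertical (gy) gradients.
            gx = (block[y - 1][x + 1] + 2 * block[y][x + 1] + block[y + 1][x + 1]
                  - block[y - 1][x - 1] - 2 * block[y][x - 1] - block[y + 1][x - 1])
            gy = (block[y + 1][x - 1] + 2 * block[y + 1][x] + block[y + 1][x + 1]
                  - block[y - 1][x - 1] - 2 * block[y - 1][x] - block[y - 1][x + 1])
            amplitude = abs(gx) + abs(gy)
            if amplitude == 0:
                continue
            # The edge runs perpendicular to the gradient; fold into [0, pi).
            angle = (math.atan2(gy, gx) + math.pi / 2) % math.pi
            bin_index = min(int(angle / math.pi * num_bins), num_bins - 1)
            hist[bin_index] += amplitude
    return hist


def top_candidates(hist, k):
    """Return the indices of the k bins with the largest accumulated amplitude."""
    return sorted(range(len(hist)), key=lambda b: hist[b], reverse=True)[:k]


# An 8x8 block with a single vertical edge between columns 3 and 4.
vertical_edge = [[0, 0, 0, 0, 100, 100, 100, 100] for _ in range(8)]
hist = gradient_histogram(vertical_edge)
candidates = top_candidates(hist, 3)
```

Only the directions returned by `top_candidates` would then be passed to the more expensive RMD and RDO stages.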


Signal Processing: Image Communication | 2009

Joint video/depth rate allocation for 3D video coding based on view synthesis distortion model

Yanwei Liu; Qingming Huang; Siwei Ma; Debin Zhao; Wen Gao

Joint video/depth rate allocation is an important optimization problem in 3D video coding. To address this problem, this paper proposes a distortion model to evaluate the synthesized view without access to the captured original view. The proposed distortion model is an additive model that accounts for the video-coding-induced distortion and the depth-quantization-induced distortion, as well as the inherent geometry distortion. The depth-quantization-induced distortion covers not only the warping error distortion, which is described by a piecewise linear model with the video power spectral property, but also the warping error correlation distortion between the two source reference views. Geometry distortion is approximated from that of the adjacent view synthesis. Based on the proposed distortion model, a joint rate allocation method is proposed to seek the optimal trade-off between video bit-rate and depth bit-rate for maximizing the view synthesis quality. Experimental results show that the proposed distortion model can approximately estimate the actual distortion of the synthesized view, and that the proposed rate allocation method achieves almost the same rate allocation performance as the full-search method at lower computational cost. Moreover, the proposed rate allocation method incurs lower computational cost than the hierarchical-search method at high bit-rates while providing almost equivalent rate allocation performance.


IEEE Transactions on Circuits and Systems for Video Technology | 2012

SSIM-Motivated Rate-Distortion Optimization for Video Coding

Shiqi Wang; Abdul Rehman; Zhou Wang; Siwei Ma; Wen Gao

We propose a rate-distortion optimization (RDO) scheme based on the structural similarity (SSIM) index, which was found to be a better indicator of perceived image quality than mean-squared error, but has not been fully exploited in the context of image and video coding. At the frame level, an adaptive Lagrange multiplier selection method is proposed based on a novel reduced-reference statistical SSIM estimation algorithm and a rate model that combines the side information with the entropy of the transformed residuals. At the macroblock level, the Lagrange multiplier is further adjusted based on an information theoretical approach that takes into account both the motion information content and perceptual uncertainty of visual speed perception. Finally, the mode for H.264/AVC coding is selected by the SSIM index and the adjusted Lagrange multiplier. Extensive experiments show that the proposed scheme can achieve significantly better rate-SSIM performance and provide better visual quality than conventional RDO coding schemes.
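For reference, the SSIM index that this scheme builds on can be computed for a single window as in the sketch below. It follows the standard SSIM definition with the usual constants K1 = 0.01 and K2 = 0.03; the function name `ssim_window` is hypothetical, and the paper's reduced-reference SSIM estimation and Lagrange multiplier selection are not reproduced here.

```python
def ssim_window(x, y, dynamic_range=255.0):
    """Compute the SSIM index between two equally sized sample windows.

    `x` and `y` are flat sequences of at least two pixel values each.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("windows must have equal length of at least 2")
    n = len(x)
    c1 = (0.01 * dynamic_range) ** 2  # stabilizes the luminance term
    c2 = (0.03 * dynamic_range) ** 2  # stabilizes the contrast/structure term
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - mu_y) ** 2 for b in y) / (n - 1)
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / (n - 1)
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


original = [10, 20, 30, 40]
reversed_copy = [40, 30, 20, 10]
same_score = ssim_window(original, original)
different_score = ssim_window(original, reversed_copy)
```

An identical window scores 1, and the score drops as structure diverges, which is what makes the index usable as a distortion term in place of mean-squared error.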


IEEE Transactions on Image Processing | 2013

Perceptual Video Coding Based on SSIM-Inspired Divisive Normalization

Shiqi Wang; Abdul Rehman; Zhou Wang; Siwei Ma; Wen Gao

We propose a perceptual video coding framework based on the divisive normalization scheme, which is found to be an effective approach to model the perceptual sensitivity of biological vision, but has not been fully exploited in the context of video coding. At the macroblock (MB) level, we derive the normalization factors based on the structural similarity (SSIM) index as an attempt to transform the discrete cosine transform domain frame residuals to a perceptually uniform space. We further develop an MB level perceptual mode selection scheme and a frame level global quantization matrix optimization method. Extensive simulations and subjective tests verify that, compared with the H.264/AVC video coding standard, the proposed method can achieve significant gain in terms of rate-SSIM performance and provide better visual quality.


International Conference on Multimedia and Expo | 2004

Overview of AVS video standard

Liang Fan; Siwei Ma; Feng Wu

The paper gives an overview of the AVS video standard (developed by the Audio Video Coding Standard Working Group of China) in terms of its basic features, major adopted techniques, and performance. The current version of the AVS video standard mainly targets the increasing demand for high-definition and high-quality video services. It provides a good trade-off between complexity and coding efficiency for digital broadcast and digital storage media. Furthermore, because its syntax structure is similar to that of the MPEG-2 video standard, it can be easily integrated into existing MPEG-2 systems while significantly improving coding efficiency.


IEEE Transactions on Circuits and Systems for Video Technology | 2014

Image Restoration Using Joint Statistical Modeling in a Space-Transform Domain

Jian Zhang; Debin Zhao; Ruiqin Xiong; Siwei Ma; Wen Gao

This paper presents a novel strategy for high-fidelity image restoration by characterizing both local smoothness and nonlocal self-similarity of natural images in a unified statistical manner. The main contributions are three-fold. First, from the perspective of image statistics, a joint statistical modeling (JSM) in an adaptive hybrid space-transform domain is established, which offers a powerful mechanism for combining local smoothness and nonlocal self-similarity simultaneously to ensure a more reliable and robust estimation. Second, a new form of minimization functional for solving the image inverse problem is formulated using JSM under a regularization-based framework. Finally, in order to make JSM tractable and robust, a new Split Bregman-based algorithm is developed to efficiently solve the above severely underdetermined inverse problem, together with a theoretical proof of convergence. Extensive experiments on image inpainting, image deblurring, and mixed Gaussian plus salt-and-pepper noise removal applications verify the effectiveness of the proposed algorithm.


IEEE Journal on Emerging and Selected Topics in Circuits and Systems | 2012

Image Compressive Sensing Recovery via Collaborative Sparsity

Jian Zhang; Debin Zhao; Chen Zhao; Ruiqin Xiong; Siwei Ma; Wen Gao

Compressive sensing (CS) has drawn considerable attention as a joint sampling and compression approach. Its theory shows that when a signal is sparse enough in some domain, it can be decoded from far fewer measurements than suggested by the Nyquist sampling theory. One of the most challenging research problems in CS is therefore to find a domain in which a signal exhibits a high degree of sparsity and hence can be recovered faithfully. Most conventional CS recovery approaches, however, exploit a set of fixed bases (e.g., DCT, wavelet, and gradient domain) for the entire signal. These bases ignore the nonstationarity of natural signals and cannot achieve a high enough degree of sparsity, thus resulting in poor rate-distortion performance. In this paper, we propose a new framework for image compressive sensing recovery via collaborative sparsity, which enforces local 2-D sparsity and nonlocal 3-D sparsity simultaneously in an adaptive hybrid space-transform domain, thus substantially utilizing the intrinsic sparsity of natural images and greatly confining the CS solution space. In addition, an efficient augmented Lagrangian-based technique is developed to solve the above optimization problem. Experimental results on a wide range of natural images are presented to demonstrate the efficacy of the new CS recovery strategy.


Journal of Visual Communication and Image Representation | 2006

Adaptive rate control for H.264

Zhengguo Li; Wen Gao; Feng Pan; Siwei Ma; Keng Pang Lim; Genan Feng; Xiao Lin; Susanto Rahardja; Hanqing Lu; Yan Lu

This paper presents a rate control scheme for H.264 by introducing the concept of a basic unit and a linear prediction model. The basic unit can be a macroblock (MB), a slice, or a frame. It can be used to obtain a trade-off between overall coding efficiency and bit fluctuation. The linear model is used to solve the chicken-and-egg dilemma in H.264 rate control. Both constant bit rate (CBR) and variable bit rate (VBR) cases are studied. Our scheme has been adopted by H.264.
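A linear prediction model of this kind can be sketched as below. The abstract does not say which quantity is predicted; this sketch assumes the model predicts the current basic unit's residual complexity (mean absolute difference, MAD) from the co-located unit in the previous frame, so that a quantization parameter can be chosen before the actual residual is known. The function names and the least-squares update are assumptions made for illustration.

```python
def fit_linear_predictor(prev_mads, cur_mads):
    """Least-squares fit of cur = a1 * prev + a2 over past basic units."""
    if len(prev_mads) != len(cur_mads) or not prev_mads:
        raise ValueError("need equally sized, non-empty histories")
    n = len(prev_mads)
    mean_prev = sum(prev_mads) / n
    mean_cur = sum(cur_mads) / n
    spread = sum((p - mean_prev) ** 2 for p in prev_mads)
    if spread == 0:
        # Degenerate history: fall back to "same as previous frame".
        return 1.0, 0.0
    a1 = sum((p - mean_prev) * (c - mean_cur)
             for p, c in zip(prev_mads, cur_mads)) / spread
    a2 = mean_cur - a1 * mean_prev
    return a1, a2


def predict_mad(prev_mad, a1, a2):
    """Predict the current unit's MAD before it has been encoded."""
    return a1 * prev_mad + a2


# Hypothetical history of (previous-frame MAD, observed MAD) pairs.
a1, a2 = fit_linear_predictor([2.0, 4.0, 6.0, 8.0], [2.0, 3.0, 4.0, 5.0])
predicted = predict_mad(10.0, a1, a2)
```

Predicting the complexity first breaks the circular dependency: the quantization parameter is chosen from the prediction, and the model coefficients are refreshed once the real value is available.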


IEEE Transactions on Image Processing | 2013

Compression Artifact Reduction by Overlapped-Block Transform Coefficient Estimation With Block Similarity

Xinfeng Zhang; Ruiqin Xiong; Xiaopeng Fan; Siwei Ma; Wen Gao

Block transform coded images usually suffer from annoying artifacts at low bit rates, caused by the coarse quantization of transform coefficients. In this paper, we propose a new method to reduce compression artifacts by the overlapped-block transform coefficient estimation from non-local blocks. In the proposed method, the discrete cosine transform coefficients of each block are estimated by adaptively fusing two prediction values based on their reliabilities. One prediction is the quantized values of coefficients decoded from the compressed bitstream, whose reliability is determined by quantization steps. The other prediction is the weighted average of the coefficients in nonlocal blocks, whose reliability depends on the variance of the coefficients in these blocks. The weights are used to distinguish the effectiveness of the coefficients in nonlocal blocks to predict original coefficients and are determined by block similarity in transform domain. To solve the optimization problem, the overlapped blocks are divided into several subsets. Each subset contains nonoverlapped blocks covering the whole image and is optimized independently. Therefore, the overall optimization is reduced to a set of sub-optimization problems, which can be easily solved. Finally, we provide a strategy for parameter selection based on the compression levels. Experimental results show that the proposed method can remarkably reduce compression artifacts and significantly improve both the subjective and objective qualities of block transform coded images.
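The idea of fusing two predictions according to their reliabilities can be sketched with inverse-variance weighting. This is a generic illustration, not the paper's estimator: it assumes reliability is the inverse of each prediction's error variance, and it models quantization error with the uniform-noise approximation step squared over 12. Both function names are hypothetical.

```python
def uniform_quantization_variance(step):
    """Approximate error variance of a uniform quantizer with the given step."""
    return step * step / 12.0


def fuse_predictions(pred_a, var_a, pred_b, var_b):
    """Combine two estimates of one coefficient, weighting by inverse variance."""
    total = var_a + var_b
    if total == 0:
        # Both estimates are treated as exact; average them.
        return 0.5 * (pred_a + pred_b)
    weight_a = var_b / total  # the more reliable estimate gets the larger weight
    return weight_a * pred_a + (1.0 - weight_a) * pred_b


# Decoded (quantized) coefficient versus a nonlocal weighted average.
decoded_value, nonlocal_value = 12.0, 9.0
fused = fuse_predictions(decoded_value, uniform_quantization_variance(6.0),
                         nonlocal_value, 1.0)
```

With a coarse quantization step the decoded value is less reliable, so the fused estimate moves toward the nonlocal average; with a fine step it stays close to the decoded value.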

Collaboration


Dive into Siwei Ma's collaborations.

Top Co-Authors

Wen Gao

Chinese Academy of Sciences

Debin Zhao

Harbin Institute of Technology

Shiqi Wang

City University of Hong Kong

Xinfeng Zhang

Nanyang Technological University

Xiaopeng Fan

Harbin Institute of Technology
