Publication


Featured research published by Zulin Wang.


Signal Processing: Image Communication | 2015

Weight-based R-λ rate control for perceptual HEVC coding on conversational videos

Shengxi Li; Mai Xu; Xin Deng; Zulin Wang

This paper proposes a novel weight-based R-λ scheme for rate control in HEVC, to improve the perceived visual quality of conversational videos. The conventional R-λ scheme for HEVC rate control allocates bits on the basis of bits per pixel (bpp); however, bpp does not reflect how visual importance varies across pixels. We therefore propose a novel weight-based R-λ scheme that takes this visual importance into account. We first conducted an eye-tracking experiment on training videos to quantify the differing importance of the background, face, and facial features, thus generating weight maps for the videos to be encoded. Given these weight maps, our scheme is able to allocate more bits to the face (and especially to the facial features), using a new term, bit per weight (bpw). Consequently, the visual quality of the face and facial features is improved, such that perceptual video coding is achieved for HEVC, as verified by our experimental results.

Highlights:
- Our work is based on the latest R-λ rate control scheme.
- Beyond face regions, the regions inside the faces are treated with unequal importance.
- We conduct eye-tracking experiments to learn the importance weights of different regions from training videos.
- Bits are perceptually allocated according to the term bit per weight (bpw).
- We combine objective and subjective quality assessments to make the experimental results more convincing.
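The core mechanism is replacing bpp with bit per weight: each region's share of the frame budget is proportional to its summed perceptual weight rather than its pixel count. Below is a minimal Python sketch of that allocation step; the function name, the 3x face weighting, and the toy budget are our own illustrative assumptions, not values from the paper.

```python
import numpy as np

def allocate_bits_bpw(frame_budget, weight_maps):
    """Split a frame-level bit budget across coding units in proportion
    to their summed perceptual weights (bit per weight, bpw), instead of
    their pixel counts (bit per pixel, bpp)."""
    totals = np.array([w.sum() for w in weight_maps], dtype=np.float64)
    bpw = frame_budget / totals.sum()   # bits per unit of weight
    return bpw * totals                 # per-unit bit targets

# Toy example: a "face" unit weighted 3x a background unit of equal size.
background = np.ones((64, 64))
face = 3.0 * np.ones((64, 64))
print(allocate_bits_bpw(100_000, [background, face]))  # ~[25000, 75000]
```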


IEEE Transactions on Circuits and Systems for Video Technology | 2016

Subjective-Driven Complexity Control Approach for HEVC

Xin Deng; Mai Xu; Lai Jiang; Xiaoyan Sun; Zulin Wang

The latest High Efficiency Video Coding (HEVC) standard significantly increases encoding complexity in exchange for improved coding efficiency, compared with the preceding H.264/Advanced Video Coding (AVC) standard. In this paper, we present a novel subjective-driven complexity control (SCC) approach to reduce and control the encoding complexity of HEVC. By reasonably adjusting the maximum depth of each largest coding unit (LCU), the encoding complexity can be reduced to a target level with minimal visual distortion. Specifically, the maximum depths of different LCUs are varied by solving the proposed optimization formulation of complexity control, which builds on two explored relationships: 1) between the maximum depth and encoding complexity and 2) between the maximum depth and visual distortion. Moreover, subjective visual quality is favored through a novel subjective-driven constraint imposed in the formulation, on the basis of a visual attention model. Finally, experimental results show that our approach achieves a wide range of encoding complexity control (as low as 20%) for HEVC, with the smallest complexity bias being 0.2%. Meanwhile, our SCC approach outperforms two other state-of-the-art complexity control approaches in terms of both control accuracy and visual quality.
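To make the idea concrete, here is a greedy Python sketch of such a control loop: every LCU starts at full depth, and depth is capped on the least-salient LCUs first until an assumed depth-to-complexity cost table meets the target ratio. The cost table, saliency values, and greedy order are all illustrative stand-ins; the paper instead solves an optimization formulation fitted from encoder statistics.

```python
import numpy as np

# Assumed (illustrative) relative encoding cost of capping an LCU's
# maximum CU depth at 0..3; the paper fits such a relationship
# empirically from encoder statistics.
COST = {0: 0.2, 1: 0.45, 2: 0.75, 3: 1.0}

def assign_max_depths(saliency, target_ratio):
    """Greedy sketch of subjective-driven complexity control: start every
    LCU at full depth 3, then lower the depth of the least-salient LCUs
    first until the average cost meets the target complexity ratio."""
    depths = np.full(len(saliency), 3)
    order = np.argsort(saliency)        # least salient LCUs first
    for d in (2, 1, 0):                 # progressively tighter depth caps
        for i in order:
            if np.mean([COST[k] for k in depths]) <= target_ratio:
                return depths
            depths[i] = d
    return depths

sal = np.random.rand(100)                        # per-LCU saliency values
d = assign_max_depths(sal, target_ratio=0.5)     # aim for 50% complexity
print(np.mean([COST[k] for k in d]))
```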


International Conference on Computer Vision | 2015

Learning to Predict Saliency on Face Images

Mai Xu; Yun Ren; Zulin Wang

This paper proposes a novel method that learns to detect the saliency of face images. Specifically, we obtain a database of eye-tracking data over an extensive set of face images by conducting an eye-tracking experiment. Analyzing this database, we verify that fixations tend to cluster around facial features when viewing images with large faces. To model attention on faces and facial features, the proposed method learns a Gaussian mixture model (GMM) distribution from the eye-tracking fixations as the top-down features for saliency detection of face images. In our method, the top-down features (i.e., face and facial features) based on the learnt GMM are then linearly combined with conventional bottom-up features (i.e., color, intensity, and orientation) for saliency detection. For this linear combination, we argue that the weights of the top-down feature channels depend on the face size in the image, and the relationship between the weights and face size is therefore learnt from the training eye-tracking data. Finally, experimental results show that our learning-based method advances state-of-the-art saliency prediction for face images. The corresponding database and code are available online: www.ee.buaa.edu.cn/xumfiles/saliency_detection.html.
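The final combination step reduces to a weighted sum of feature maps whose top-down weights grow with the face's share of the image. A minimal Python sketch follows; the linear ramp for the weights and all numeric constants are made-up placeholders (the paper learns this relationship from eye-tracking data):

```python
import numpy as np

def combine_saliency(bottom_up, face_map, feature_map, face_ratio):
    """Linearly combine bottom-up saliency with learnt top-down face /
    facial-feature channels. The paper learns how the top-down weights
    depend on face size; here we substitute a made-up linear ramp."""
    w_face = np.clip(0.2 + 0.6 * face_ratio, 0.0, 0.6)   # hypothetical
    w_feat = 0.5 * w_face                                 # hypothetical
    w_bu = 1.0 - w_face - w_feat
    s = w_bu * bottom_up + w_face * face_map + w_feat * feature_map
    return s / s.max()                                    # normalize to [0,1]

h, w = 90, 120
bu = np.random.rand(h, w)   # conventional color/intensity/orientation map
face = np.zeros((h, w)); face[30:60, 40:80] = 1.0   # GMM face channel
feat = np.zeros((h, w)); feat[40:50, 50:70] = 1.0   # facial-feature channel
print(combine_saliency(bu, face, feat, face_ratio=0.25).shape)
```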


IEEE Transactions on Image Processing | 2017

Learning to Detect Video Saliency With HEVC Features

Mai Xu; Lai Jiang; Xiaoyan Sun; Zhaoting Ye; Zulin Wang

Saliency detection has been widely studied to predict human fixations, with various applications in computer vision and image processing. In this paper, we argue that the state-of-the-art High Efficiency Video Coding (HEVC) standard can be used to generate useful features in the compressed domain, and we therefore propose to learn a video saliency model with regard to HEVC features. First, we establish an eye-tracking database for video saliency detection, which can be downloaded from https://github.com/remega/video_database. Through statistical analysis of our eye-tracking database, we find that human fixations tend to fall into regions with large-valued HEVC features for splitting depth, bit allocation, and motion vector (MV). Three further observations are obtained through additional analysis of the database. Accordingly, several HEVC-domain features are proposed on the basis of splitting depth, bit allocation, and MV. Next, a support vector machine is learned to integrate these HEVC features for video saliency detection. Since almost all video data are stored in compressed form, our method avoids both the computational cost of decoding and the storage cost of raw data. More importantly, experimental results show that the proposed method is superior to other state-of-the-art saliency detection methods, in both the compressed and uncompressed domains.
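The integration step is a standard supervised-learning setup: per-block HEVC features (splitting depth, allocated bits, MV magnitude) are mapped to a fixation label by an SVM. A hedged sketch using scikit-learn, with random stand-ins for the parsed bitstream statistics and labels:

```python
import numpy as np
from sklearn.svm import SVC

# Per-block HEVC-domain features [splitting depth, allocated bits, |MV|],
# normalized to [0, 1]. Values and labels below are random stand-ins for
# statistics parsed from real bitstreams and eye-tracking fixations.
X = np.random.rand(500, 3)
y = (X @ np.array([0.5, 0.3, 0.2]) > 0.5).astype(int)  # fake fixation labels

# Learn the SVM integrator; predicted probabilities serve as per-block
# saliency scores at test time.
clf = SVC(kernel="rbf", probability=True).fit(X, y)
block_features = np.random.rand(10, 3)
saliency_scores = clf.predict_proba(block_features)[:, 1]
print(saliency_scores.round(2))
```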


Pattern Recognition | 2016

Bottom-up saliency detection with sparse representation of learnt texture atoms

Mai Xu; Lai Jiang; Zhaoting Ye; Zulin Wang

This paper proposes a saliency detection method that explores a novel low-level feature based on sparse representation of learnt texture atoms (SR-LTA). The learnt texture atoms are encoded in salient and non-salient dictionaries. For the salient dictionary, a formulation is proposed to learn salient texture atoms from image patches that attract extensive attention, and the online salient dictionary learning (OSDL) algorithm is presented to solve this formulation. Similarly, the non-salient dictionary is learnt from image patches receiving no attention. The pixel-wise SR-LTA feature is then obtained from the difference of sparse representation errors with respect to the learnt salient and non-salient dictionaries. Finally, image saliency can be predicted by linearly combining the proposed SR-LTA feature with the conventional features of luminance and contrast, with the weights of the different feature channels determined by least-squares estimation on the training data. Experimental results show that our method outperforms 9 state-of-the-art methods for bottom-up saliency detection.

Highlights:
- We develop the OSDL algorithm for learning the salient and non-salient dictionaries.
- We propose the SR-LTA feature for bottom-up saliency detection, in light of the learnt salient and non-salient dictionaries.
- We validate that the proposed SR-LTA feature advances state-of-the-art saliency detection on natural images.
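The feature itself is simply a difference of reconstruction errors under two dictionaries. A minimal Python sketch using scikit-learn's SparseCoder with OMP as the sparse solver; the random dictionaries stand in for ones learnt by OSDL, and the patch size and sparsity level are our own choices:

```python
import numpy as np
from sklearn.decomposition import SparseCoder

def sr_lta_feature(patches, D_sal, D_non, n_nonzero=5):
    """SR-LTA sketch: sparse-code each patch against the salient and
    non-salient dictionaries and return the difference of reconstruction
    errors. A patch that the salient dictionary reconstructs well but the
    non-salient one reconstructs poorly scores high."""
    def recon_error(D):
        coder = SparseCoder(dictionary=D, transform_algorithm="omp",
                            transform_n_nonzero_coefs=n_nonzero)
        codes = coder.transform(patches)
        return np.linalg.norm(patches - codes @ D, axis=1)
    return recon_error(D_non) - recon_error(D_sal)

# Random stand-ins for OSDL-learnt dictionaries (atoms x patch_dim),
# with unit-norm atoms as OMP expects.
rng = np.random.default_rng(0)
D_sal = rng.standard_normal((64, 49))
D_sal /= np.linalg.norm(D_sal, axis=1, keepdims=True)
D_non = rng.standard_normal((64, 49))
D_non /= np.linalg.norm(D_non, axis=1, keepdims=True)
patches = rng.standard_normal((20, 49))   # 7x7 patches, flattened
print(sr_lta_feature(patches, D_sal, D_non).shape)
```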


International Conference on Multimedia and Expo | 2015

A novel method on optimal bit allocation at LCU level for rate control in HEVC

Shengxi Li; Mai Xu; Zulin Wang

In this paper, we propose a new method, namely the recursive Taylor expansion (RTE) method, for optimally allocating bits to each LCU in the R-λ rate control scheme of HEVC. Specifically, we first set up an optimization formulation for optimal bit allocation. Unfortunately, a closed-form solution to this formulation is intractable. We therefore propose an RTE solution that iteratively solves the formulation with fast convergence, yielding an approximate closed-form solution. This way, optimal bit allocation can be achieved at little cost in encoding complexity. Finally, experimental results validate the effectiveness of our method in three respects: compression distortion, bit-rate control error, and bit fluctuation.
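The underlying problem is to minimize total distortion under the hyperbolic R-D model D_i = c_i R_i^{-k_i} subject to a total bit budget. The stationarity condition expresses each LCU's bits as a function of a single Lagrange multiplier, which has no closed form when the k_i differ. The Python sketch below finds the multiplier by simple bisection rather than the paper's recursive Taylor expansion; the model parameters are made-up examples.

```python
import numpy as np

def allocate_lcu_bits(c, k, R_total, iters=60):
    """Optimal LCU-level bit allocation under D_i = c_i * R_i**(-k_i):
    minimize total distortion subject to sum(R_i) = R_total. Stationarity
    gives R_i = (c_i * k_i / lam)**(1 / (k_i + 1)); we find the Lagrange
    multiplier lam by bisection here (the paper instead solves the same
    condition via a recursive Taylor expansion)."""
    def total_bits(lam):
        return np.sum((c * k / lam) ** (1.0 / (k + 1.0)))
    lo, hi = 1e-12, 1e12
    for _ in range(iters):
        mid = np.sqrt(lo * hi)            # bisection in log domain
        lo, hi = (mid, hi) if total_bits(mid) > R_total else (lo, mid)
    lam = np.sqrt(lo * hi)
    return (c * k / lam) ** (1.0 / (k + 1.0))

c = np.array([2.0, 1.5, 3.0])            # made-up per-LCU R-D parameters
k = np.array([0.9, 1.1, 1.0])
R = allocate_lcu_bits(c, k, R_total=6000.0)
print(R, R.sum())                         # per-LCU bits, summing to ~6000
```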


International Conference on Multimedia and Expo | 2017

Decoder-side HEVC quality enhancement with scalable convolutional neural network

Ren Yang; Mai Xu; Zulin Wang

The latest High Efficiency Video Coding (HEVC) standard has been increasingly used to generate video streams over the Internet. However, decoded HEVC video streams may incur severe quality degradation, especially at low bit-rates, so it is necessary to enhance the visual quality of HEVC videos at the decoder side. To this end, we propose in this paper a Decoder-side Scalable Convolutional Neural Network (DS-CNN) approach for quality enhancement of HEVC, which does not require any modification of the encoder. In particular, our DS-CNN approach learns a Convolutional Neural Network (CNN) model to reduce the distortion of both I and B/P frames in HEVC. It thus differs from existing CNN-based quality enhancement approaches, which only handle intra-coding distortion and are therefore not suitable for B/P frames. Furthermore, a scalable structure is included in our DS-CNN, such that the computational complexity of our approach is adjustable to the available computational resources. Finally, experimental results show the effectiveness of our DS-CNN approach in enhancing the quality of both I and B/P frames of HEVC.
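The "scalable" part can be pictured as a chain of refinement stages that can be truncated at inference time. The PyTorch sketch below shows that pattern only; the layer sizes, stage count, and residual structure are our own placeholders, not the DS-CNN architecture from the paper.

```python
import torch
import torch.nn as nn

class ScalableQE(nn.Module):
    """Sketch of a decoder-side scalable quality-enhancement CNN: a chain
    of residual refinement stages over the decoded frame, where later
    stages can be skipped when compute is scarce. All sizes here are our
    own placeholders, not the DS-CNN architecture from the paper."""
    def __init__(self, stages=3, ch=32):
        super().__init__()
        self.stages = nn.ModuleList(
            nn.Sequential(nn.Conv2d(1, ch, 3, padding=1), nn.ReLU(),
                          nn.Conv2d(ch, 1, 3, padding=1))
            for _ in range(stages))

    def forward(self, x, active_stages=None):
        # Run only the first `active_stages` stages (all if None).
        for stage in self.stages[:active_stages]:
            x = x + stage(x)              # residual refinement
        return x

decoded = torch.rand(1, 1, 64, 64)        # luma of a decoded frame
model = ScalableQE()
fast = model(decoded, active_stages=1)    # low-complexity mode
full = model(decoded)                     # full-quality mode
print(fast.shape, full.shape)
```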


International Conference on Multimedia and Expo | 2014

A novel weight-based URQ scheme for perceptual video coding of conversational video in HEVC

Shengxi Li; Mai Xu; Xin Deng; Zulin Wang

In this paper, we propose a novel weight-based unified rate-quantization (URQ) scheme for rate control in the state-of-the-art HEVC standard, to improve its perceived visual quality for conversational videos. The conventional rate control of HEVC uses a pixel-wise URQ scheme built on the concept of bits per pixel (bpp), which can assign different amounts of bits to blocks of various sizes and is thus well suited to the flexible picture partitioning of HEVC. However, bpp does not reflect the visual importance of each pixel. We therefore propose a novel weight-based URQ scheme that takes visual importance into account for rate control in HEVC. Combined with the weight map acquired from a novel hierarchical perceptual model of the face, this scheme can allocate more bits to the face, and still more to the facial features, by using bit per weight (bpw) instead of bpp. As a result, the visual quality of the face, and especially of the facial features, is improved such that perceptual video coding is achieved for HEVC. Finally, the experimental results validate this improvement.


International Conference on Multimedia and Expo | 2017

A novel rate control scheme for panoramic video coding

Yufan Liu; Mai Xu; Chen Li; Shengxi Li; Zulin Wang

Multi-view panoramic videos have become considerably more popular for producing Virtual Reality (VR) content, owing to their immersive visual experience. We argue in this paper that PSNR is less effective than Sphere-based PSNR (S-PSNR) in assessing the visual quality of compressed panoramic videos, since S-PSNR accounts for the sphere-to-plane mapping of panoramic videos. Thus, the conventional rate control (RC) schemes of 2-Dimensional (2D) video coding, which optimize PSNR, are not suitable for panoramic video coding. To optimize S-PSNR, we propose in this paper a novel RC scheme for panoramic video coding. Specifically, we develop an S-PSNR optimization formulation with a constraint on bit-rate. We then provide a solution to this formulation, so that bits can be allocated to each coding block to achieve the optimal S-PSNR in panoramic video coding. Finally, the experimental results validate the effectiveness of the proposed RC scheme in improving the S-PSNR of panoramic video coding.
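S-PSNR differs from plain PSNR in where the error is measured: at points sampled uniformly on the sphere and then projected onto the frame, so over-stretched polar pixels are not over-weighted. A minimal Python sketch for equirectangular frames, using a Fibonacci lattice for the sphere sampling (one common choice; the exact sampling used in the paper may differ):

```python
import numpy as np

def s_psnr(ref, dist, n_samples=10_000, peak=255.0):
    """Sphere-based PSNR sketch for equirectangular frames: sample points
    roughly uniformly on the sphere (Fibonacci lattice), map them to
    pixel coordinates via the equirectangular projection, and compute
    PSNR over those samples only."""
    h, w = ref.shape
    i = np.arange(n_samples)
    lat = np.arcsin(1 - 2 * (i + 0.5) / n_samples)          # [-pi/2, pi/2]
    lon = (i * np.pi * (3 - np.sqrt(5))) % (2 * np.pi) - np.pi
    y = np.clip(((0.5 - lat / np.pi) * h).astype(int), 0, h - 1)
    x = np.clip(((lon / (2 * np.pi) + 0.5) * w).astype(int), 0, w - 1)
    mse = np.mean((ref[y, x].astype(float) - dist[y, x].astype(float)) ** 2)
    return 10 * np.log10(peak ** 2 / mse)

# Toy frames: a random reference and a lightly perturbed "compressed" copy.
ref = np.random.randint(0, 256, (360, 720), dtype=np.uint8)
dist = np.clip(ref + np.random.randint(-5, 6, ref.shape), 0, 255).astype(np.uint8)
print(round(s_psnr(ref, dist), 2))
```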


International Conference on Computer Vision | 2015

Image Saliency Detection with Sparse Representation of Learnt Texture Atoms

Lai Jiang; Mai Xu; Zhaoting Ye; Zulin Wang

This paper proposes a saliency detection method using a novel feature based on sparse representation of learnt texture atoms (SR-LTA), which are encoded in salient and non-salient dictionaries. For the salient dictionary, a novel formulation is proposed to learn salient texture atoms from image patches that attract extensive attention, and the online salient dictionary learning (OSDL) algorithm is presented to solve this formulation. Similarly, the non-salient dictionary can be learnt from image patches receiving no attention. A new pixel-wise feature, namely SR-LTA, is obtained from the difference of sparse representation errors with respect to the learnt salient and non-salient dictionaries. Finally, image saliency can be predicted via a linear combination of the proposed SR-LTA feature and the conventional features of luminance and contrast, with the weights of the different feature channels determined by least-squares estimation on the training data. Experimental results show that our method outperforms several state-of-the-art saliency detection methods.
