
Publication


Featured research published by Yaowu Xu.


International Conference on Multimedia and Expo | 2011

Technical overview of VP8, an open source video codec for the web

Jim Bankoski; Paul Wilkins; Yaowu Xu

VP8 is an open source video compression format supported by a consortium of technology companies. This paper provides a technical overview of the format, with an emphasis on its unique features. The paper also discusses how these features benefit VP8 in achieving high compression efficiency and low decoding complexity at the same time.


Picture Coding Symposium | 2013

The latest open-source video codec VP9 - An overview and preliminary results

Debargha Mukherjee; Jim Bankoski; Adrian Grange; Jingning Han; John Koleszar; Paul Wilkins; Yaowu Xu; Ronald Sebastiaan Bultje

Google has recently finalized a next generation open-source video codec called VP9, as part of the libvpx repository of the WebM project (http://www.webmproject.org/). Starting from the VP8 video codec released by Google in 2010 as the baseline, various enhancements and new tools were added, resulting in the next-generation VP9 bit-stream. This paper provides a brief technical overview of VP9 along with comparisons with other state-of-the-art video codecs H.264/AVC and HEVC on standard test sets. Results show VP9 to be quite competitive with mainstream state-of-the-art codecs.


Proceedings of SPIE | 2013

Towards a next generation open-source video codec

Jim Bankoski; Ronald Sebastiaan Bultje; Adrian Grange; Qunshan Gu; Jingning Han; John Koleszar; Debargha Mukherjee; Paul Wilkins; Yaowu Xu

Google has recently been developing a next generation open-source video codec called VP9, as part of the experimental branch of the libvpx repository included in the WebM project (http://www.webmproject.org/). Starting from the VP8 video codec released by Google in 2010 as the baseline, a number of enhancements and new tools have been added to improve the coding efficiency. This paper provides a technical overview of the current status of this project along with comparisons against other state-of-the-art video codecs, H.264/AVC and HEVC. The new tools added so far include: larger prediction block sizes up to 64x64, various forms of compound INTER prediction, more modes for INTRA prediction, 1/8-pel motion vectors and 8-tap switchable sub-pel interpolation filters, improved motion reference generation and motion vector coding, improved entropy coding and frame-level entropy adaptation for various symbols, improved loop filtering, incorporation of Asymmetric Discrete Sine Transforms, larger 16x16 and 32x32 DCTs, and frame-level segmentation to group similar areas together. Other tools and various bitstream features are being actively worked on as well. The VP9 bitstream is expected to be finalized by early- to mid-2013. Results show VP9 to be quite competitive in performance with mainstream state-of-the-art codecs.


Proceedings of SPIE | 2015

An overview of new video coding tools under consideration for VP10: the successor to VP9

Debargha Mukherjee; Hui Su; James Bankoski; Alex Converse; Jingning Han; Zoe Liu; Yaowu Xu

Google started an open-source project, entitled the WebM Project, in 2010 to develop royalty-free video codecs for the web. The present generation codec developed in the WebM project, called VP9, was finalized in mid-2013 and is currently being served extensively by YouTube, resulting in billions of views per day. Even though adoption of VP9 outside Google is still in its infancy, the WebM project has already embarked on an ambitious project to develop a next-edition codec, VP10, that achieves at least a generational bitrate reduction over the current generation codec VP9. Although the project is still in its early stages, a set of new experimental coding tools has already been added to baseline VP9, achieving modest coding gains over a large enough test set. This paper provides a technical overview of these coding tools.


Proceedings of SPIE | 2014

An optimized template matching approach to intra coding in video/image compression

Hui Su; Jingning Han; Yaowu Xu

Template matching prediction is an established approach to intra-frame coding that makes use of previously coded pixels in the same frame for reference. It compares the previously reconstructed upper and left boundaries when searching the reference area for the best-matched block for prediction, and hence eliminates the need to send additional information to reproduce the same prediction at the decoder. Viewing the image signal as an auto-regressive model, this work is premised on the fact that pixels closer to the known block boundary are better predicted than those farther away. It significantly extends the scope of the template matching approach, which is typically followed by a conventional discrete cosine transform (DCT) for the prediction residuals, by employing an asymmetric discrete sine transform (ADST), whose basis functions vanish at the prediction boundary and reach maximum magnitude at the far end, to fully exploit the statistics of the residual signals. It is experimentally shown that the proposed scheme provides substantial coding performance gains on top of the conventional template matching method over the baseline.
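The decoder-side search behind template matching can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function name `template_match_predict`, the single-pixel-thick template, and the exhaustive search restricted to fully reconstructed rows are all simplifying assumptions.

```python
import numpy as np

def template_match_predict(recon, top, left, bh, bw):
    """Predict the bh x bw block at (top, left) by matching its L-shaped
    template (the row above and the column to the left) against candidate
    positions in the previously reconstructed area.  Simplified sketch:
    the template is one pixel thick and candidates are limited to rows
    that lie entirely above the target block."""
    target_tpl = np.concatenate([recon[top - 1, left:left + bw],
                                 recon[top:top + bh, left - 1]])
    best_cost, best_pred = None, None
    for r in range(1, top - bh + 1):            # candidate rows fully above target
        for c in range(1, recon.shape[1] - bw + 1):
            cand_tpl = np.concatenate([recon[r - 1, c:c + bw],
                                       recon[r:r + bh, c - 1]])
            # Sum of absolute differences between the two templates.
            cost = int(np.abs(cand_tpl.astype(np.int64)
                              - target_tpl.astype(np.int64)).sum())
            if best_cost is None or cost < best_cost:
                best_cost = cost
                best_pred = recon[r:r + bh, c:c + bw].copy()
    return best_pred
```

Because the decoder owns the same reconstructed pixels, it can repeat the identical search and arrive at the same predictor without any signaled offset, which is what removes the side information.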


International Conference on Image Processing | 2016

A dynamic motion vector referencing scheme for video coding

Jingning Han; Yaowu Xu; James Bankoski

Video codecs exploit temporal redundancy in video signals, through the use of motion compensated prediction, to achieve superior compression performance. The coding of motion vectors takes a large portion of the total rate cost. Prior research utilizes the spatial and temporal correlation of the motion field to improve the coding efficiency of the motion information. It typically constructs a candidate pool composed of a fixed number of reference motion vectors and allows the codec to select and reuse the one that best approximates the motion of the current block. This largely disconnects the entropy coding process from the block's motion information and discards any information related to motion consistency, leading to sub-optimal coding performance. An alternative motion vector referencing scheme is proposed in this work to fully accommodate the dynamic nature of the motion field. It adaptively extends or shortens the candidate list according to the actual number of available reference motion vectors. The associated probability model accounts for the likelihood that an individual motion vector candidate is used. A complementary motion vector candidate ranking system is also presented. It is experimentally shown that the proposed scheme achieves about 1.6% compression performance gains on a wide range of test clips.
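The variable-length candidate list can be illustrated with a small sketch. This is a hedged approximation of the general idea, not the paper's algorithm: the helper name `build_mv_candidates` and the tie-breaking rule (first appearance wins) are assumptions for illustration.

```python
from collections import Counter

def build_mv_candidates(neighbor_mvs):
    """Build a dynamically sized motion-vector candidate list from the
    motion vectors observed in the causal neighborhood.  Unlike a
    fixed-size pool padded with default vectors, the list contains only
    vectors that actually occur, ranked by how often each one is used
    (ties broken by order of first appearance)."""
    counts = Counter(neighbor_mvs)
    first_seen = {mv: i for i, mv in enumerate(dict.fromkeys(neighbor_mvs))}
    return sorted(counts, key=lambda mv: (-counts[mv], first_seen[mv]))
```

The list length now reflects how many distinct motion vectors the neighborhood actually supplies, so an index entropy coder can condition its probability model on that length rather than on a fixed pool size.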


International Conference on Acoustics, Speech, and Signal Processing | 2017

A constrained adaptive scan order approach to transform coefficient entropy coding

Ching-Han Chiang; Jingning Han; Yaowu Xu

Transform coefficient coding is a key module in modern video compression systems. Typically, a block of quantized coefficients is processed in a pre-defined zig-zag order, starting from DC and sweeping through low-frequency positions to high-frequency ones. Correlation between the magnitudes of adjacent coefficients is exploited via context-based probability models to improve compression efficiency. Such a scheme is premised on the assumption that spatial transforms compact energy towards lower-frequency coefficients, and that a scan pattern following a descending order of the likelihood of coefficients being non-zero provides more accurate probability modeling. However, a pre-defined zig-zag pattern that is agnostic to signal statistics may not be optimal. This work proposes an adaptive approach that generates the scan pattern dynamically. Unlike prior attempts that directly sort a 2-D array of coefficient positions according to the appearance frequency of non-zero levels only, the proposed scheme employs a topological sort that also fully accounts for the spatial constraints due to the context dependency in entropy coding. A streamlined framework is designed for processing both intra and inter prediction residuals. This generic approach is experimentally shown to provide consistent coding performance gains across a wide range of test settings.
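The constrained sort can be sketched as a topological sort driven by a max-heap on non-zero frequency. Here the entropy-coding context of position (r, c) is assumed to be its above and left neighbors, a simplification of a codec's actual context templates, and `adaptive_scan` is an illustrative name.

```python
import heapq

def adaptive_scan(nz_freq):
    """Generate a scan order over an N x N coefficient block.
    Positions with higher non-zero frequency are scanned earlier,
    subject to the constraint that the above and left neighbors
    (used as context) are scanned first.  Kahn-style topological
    sort with a max-heap (negated frequencies) on the frontier."""
    n = len(nz_freq)
    # In-degree = number of context predecessors still unscanned.
    indeg = [[(r > 0) + (c > 0) for c in range(n)] for r in range(n)]
    heap = [(-nz_freq[0][0], 0, 0)]     # DC has no predecessors
    order = []
    while heap:
        _, r, c = heapq.heappop(heap)
        order.append((r, c))
        for nr, nc in ((r + 1, c), (r, c + 1)):
            if nr < n and nc < n:
                indeg[nr][nc] -= 1
                if indeg[nr][nc] == 0:
                    heapq.heappush(heap, (-nz_freq[nr][nc], nr, nc))
    return order
```

Positions only become eligible once their context neighbors have been emitted, so the resulting order is always consistent with the context dependency while the heap greedily favors statistically busier positions.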


Electronic Imaging | 2017

Adaptive multi-reference prediction using a symmetric framework

Zoe Liu; Debargha Mukherjee; Wei-Ting Lin; Paul Wilkins; Jingning Han; Yaowu Xu

Google started the WebM Project in 2010 to develop open source, royalty-free video codecs designed specifically for media on the Web. Subsequently, Google jointly founded a consortium of major tech companies called the Alliance for Open Media (AOM) to develop a new codec, AV1, aiming at a next-edition codec that achieves at least a generational improvement in coding efficiency over VP9. This paper proposes a new coding tool as one of the many efforts devoted to AOM/AV1. In particular, we propose a second ALTREF_FRAME in the AV1 syntax, which brings the total number of reference frames to seven, on top of the work presented in [11]. ALTREF_FRAME is a constructed, no-show reference obtained through temporal filtering of a look-ahead frame. The use of two ALTREF_FRAMEs adds further flexibility to the multi-layer, multi-reference symmetric framework, and provides great potential for overall Rate-Distortion (RD) performance enhancement. The experimental results have been collected over several video test sets of various resolutions and characteristics, both texture- and motion-wise, and demonstrate that the proposed approach achieves a consistent coding gain compared against the AV1 baseline as well as against the results in [11]. For instance, using overall PSNR as the distortion metric, an average bitrate saving of 5.880% in BD-rate is obtained for the CIF-level resolution set, and 4.595% on average for the VGA-level resolution set.
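The construction of a filtered, non-displayed reference can be sketched as a weighted temporal average over a look-ahead window. This is a deliberately simplified stand-in: the actual ALTREF filtering in the codec is motion-compensated and adapts per block, and the function name `build_altref` is made up for illustration.

```python
import numpy as np

def build_altref(frames, weights=None):
    """Construct an ALTREF-style 'no-show' reference frame by temporally
    filtering a window of look-ahead frames.  Sketch: a plain weighted
    average; a real encoder applies motion compensation before blending
    so that moving content is aligned rather than smeared."""
    stack = np.stack([np.asarray(f, dtype=np.float64) for f in frames])
    if weights is None:
        weights = np.ones(len(frames))          # uniform blend by default
    w = np.asarray(weights, dtype=np.float64)
    w = w / w.sum()                             # normalize blending weights
    # Weighted sum over the temporal axis -> one filtered reference frame.
    return np.tensordot(w, stack, axes=1)
```

Averaging across frames suppresses noise and transient detail, which is why such a constructed reference can predict a run of upcoming frames better than any single displayed frame.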


Applications of Digital Image Processing XL | 2017

Novel inter and intra prediction tools under consideration for the emerging AV1 video codec

Urvang Joshi; Debargha Mukherjee; Jingning Han; Yue Chen; Sarah Parker; Hui Su; Angie Chiang; Yaowu Xu; Zoe Liu; Yunqing Wang; Jim Bankoski; Chen Wang; Emil Keyder

Google started the WebM Project in 2010 to develop open source, royalty-free video codecs designed specifically for media on the Web. The second generation codec released by the WebM project, VP9, is currently served by YouTube and enjoys billions of views per day. Realizing the need for even greater compression efficiency to cope with the growing demand for video on the web, the WebM team embarked on an ambitious project to develop a next-edition codec, AV1, in a consortium of major tech companies called the Alliance for Open Media, that achieves at least a generational improvement in coding efficiency over VP9. In this paper, we focus primarily on new tools in AV1 that improve the prediction of pixel blocks before transforms, quantization and entropy coding are invoked. Specifically, we describe tools and coding modes that improve intra, inter and combined inter-intra prediction. Results are presented on standard test sets.


International Conference on Image Processing | 2016

A staircase transform coding scheme for screen content video coding

Cheng Chen; Jingning Han; Yaowu Xu; James Bankoski

Demand for screen content videos that contain computer-generated text and graphics is growing. They are very different from natural videos, because they include much sharper edge transitions and very repetitive patterns. On this type of material, the efficacy of the conventional discrete cosine transform (DCT) is questionable, because it relies on the assumption that a Gauss-Markov model leads to a base-band signal; that assumption may not hold true for screen content material. This work exploits a class of staircase transforms. Unlike the DCT, whose bases are samplings of sinusoidal functions, the staircase transforms have their bases sampled from staircase functions, which better approximate the sharp transitions often encountered in screen content. The staircase transform is integrated into a hybrid transform coding scheme, in conjunction with the DCT. It is experimentally shown that the proposed approach provides an average compression performance gain of 2.9% in terms of BD-rate reduction. A perceptual comparison further demonstrates that the staircase transform achieves a substantial reduction in ringing artifacts caused by the Gibbs phenomenon.
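To see why piecewise-constant bases suit sharp edges, one can compare an orthonormal staircase-shaped basis against the DCT on a step signal. The Haar basis below is a stand-in whose basis vectors are staircase functions; it is not the specific transform family from the paper.

```python
import numpy as np

def haar_matrix(n):
    """Orthonormal Haar basis of size n (n a power of two).  Its basis
    vectors are piecewise-constant 'staircase' functions."""
    if n == 1:
        return np.array([[1.0]])
    h = haar_matrix(n // 2)
    top = np.kron(h, [1.0, 1.0])                 # coarser averaging rows
    bot = np.kron(np.eye(n // 2), [1.0, -1.0])   # local difference rows
    m = np.vstack([top, bot])
    return m / np.linalg.norm(m, axis=1, keepdims=True)

def dct_matrix(n):
    """Orthonormal DCT-II basis of size n, for comparison."""
    k = np.arange(n)
    m = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    m[0] *= 1.0 / np.sqrt(2.0)
    return m * np.sqrt(2.0 / n)
```

On the length-8 step signal [0,0,0,0,1,1,1,1], the Haar expansion has only two non-zero coefficients, while the orthonormal DCT-II has five (DC plus the odd harmonics), mirroring the energy-compaction argument made for staircase bases on sharp transitions.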
