Publication


Featured research published by Hui Li Tan.


international conference on acoustics, speech, and signal processing | 2012

On fast coding tree block and mode decision for High Efficiency Video Coding (HEVC)

Hui Li Tan; Fengjiao Liu; Yih Han Tan; Chuohao Yeo

In the current HEVC test model (HM), a quad-tree based coding tree block (CTB) representation is used to signal mode, partition, prediction and residual information. The large number of combinations of quad-tree partitions and modes to be tested during rate-distortion optimization (RDO) results in a high encoding complexity. In this paper, we investigate and compare a variety of algorithms for fast CTB and mode decision. Experimental results from HM4-based implementations show that different strategies can provide a range of complexity-performance trade-offs. In particular, our proposed CU Depth Pruning algorithm can reduce encoding time by about 10% with only 0.1% coding loss, while a combination of our proposed Early Partition Decision and an early CU termination approach can reduce encoding time by about 40% with about 1% coding loss.
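The recursive split-or-not decision with an early-termination shortcut can be sketched as follows. This is a minimal illustration only, not the algorithms evaluated in the paper: `decide_partition`, the `cost_fn` callback and the `early_stop` threshold are hypothetical names, and the termination rule is an assumed placeholder.

```python
from typing import Callable, Tuple

# Hypothetical cost callback: returns the rate-distortion cost of coding
# the square block at (x, y) with the given size as a single unit.
CostFn = Callable[[int, int, int], float]


def decide_partition(x: int, y: int, size: int, min_size: int,
                     cost_fn: CostFn, early_stop: float = 0.0) -> Tuple[float, object]:
    """Return (best_cost, tree); tree is "leaf" or a list of four sub-trees."""
    unsplit = cost_fn(x, y, size)
    # Assumed early-termination rule: a block that is already cheap enough
    # (or already at the minimum size) is not split any further.
    if size <= min_size or unsplit <= early_stop:
        return unsplit, "leaf"
    half = size // 2
    # Evaluate the four quadrants recursively (raster order).
    children = [decide_partition(x + dx, y + dy, half, min_size, cost_fn, early_stop)
                for dy in (0, half) for dx in (0, half)]
    split = sum(cost for cost, _ in children)
    if split < unsplit:
        return split, [tree for _, tree in children]
    return unsplit, "leaf"
```

Raising `early_stop` skips more of the recursive search, trading coding efficiency for encoding time, which is the kind of complexity-performance trade-off the abstract describes.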


IEEE Transactions on Circuits and Systems for Video Technology | 2013

On Rate Distortion Optimization Using SSIM

Chuohao Yeo; Hui Li Tan; Yih Han Tan

In this paper, we present a method for performing rate-distortion optimization (RDO) using a perceptual visual quality metric, the structural similarity index (SSIM), as the target of optimization. Rate-distortion optimization is widely used in modern video codecs to make various encoder decisions to optimize the rate-distortion tradeoff. Typically, the distortion measure used is either sum-of-square error or sum-of-absolute distance, both of which are convenient when used in the RDO framework but not always reflective of perceptual visual quality. We show that SSIM can be used as the distortion metric in the RDO framework in a simple, yet effective, manner by scaling the Lagrange multiplier used in RDO based on the local variance in that region. The experimental results on the H.264/AVC reference software show that compared to traditional RDO approaches, for the same SSIM score, the proposed approach can achieve an average rate reduction of about 9% and 14% for random access and low-delay encoding configurations, respectively. At the same time, there is no significant change in the encoding runtime.
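A Lagrangian mode decision with a scaled multiplier can be sketched as below. This is a hedged illustration of the general RDO cost J = D + λR only: `pick_mode` and `lam_scale` are hypothetical names, and `lam_scale` merely stands in for a variance-dependent factor whose actual form is given in the paper, not here.

```python
from typing import Dict, Tuple


def pick_mode(candidates: Dict[str, Tuple[float, float]], lam: float,
              lam_scale: float = 1.0) -> str:
    """Pick the mode minimising J = D + (lam_scale * lam) * R.

    candidates maps a mode name to (distortion, rate_in_bits).
    lam_scale is an assumed per-block factor (e.g. derived from local variance).
    """
    effective_lam = lam_scale * lam
    return min(candidates,
               key=lambda m: candidates[m][0] + effective_lam * candidates[m][1])
```

For example, with `{"intra": (100.0, 10.0), "inter": (60.0, 40.0)}` and `lam=1.0`, the lower-distortion "inter" mode wins; doubling the multiplier via `lam_scale=2.0` penalises its rate more heavily and flips the choice to "intra".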


IEEE Transactions on Image Processing | 2013

A Perceptually Relevant MSE-Based Image Quality Metric

Hui Li Tan; Zhengguo Li; Yih Han Tan; Susanto Rahardja; Chuohao Yeo

Image quality metrics (IQMs), such as the mean squared error (MSE) and the structural similarity index (SSIM), are quantitative measures to approximate perceived visual quality. In this paper, through analyzing the relationship between the MSE and the SSIM under an additive noise distortion model, we propose a perceptually relevant MSE-based IQM, MSE-SSIM, which is expressed in terms of the variance of the source image and the MSE between the source and distorted images. Evaluations on three publicly available databases (LIVE, CSIQ, and TID2008) show that the proposed metric, despite requiring less computation, compares favourably in performance to several existing IQMs. In addition, due to its simplicity, MSE-SSIM is amenable to use in a wide range of image and video tasks that involve solving an optimization problem. As an example, MSE-SSIM is used as the objective function in designing a Wiener filter that aims at optimizing the perceptual visual quality of the output. Experimental results show that images filtered with an MSE-SSIM-optimal Wiener filter have better visual quality than those filtered with an MSE-optimal Wiener filter.
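The link between MSE and SSIM under additive noise can be illustrated with a short derivation. It is a sketch under stated assumptions, not necessarily the paper's exact definition of MSE-SSIM: the distorted image is y = x + n, where the noise n is zero-mean with variance σ_n² and uncorrelated with the source x. Then μ_y = μ_x, σ_xy = σ_x², σ_y² = σ_x² + σ_n², and MSE = σ_n² in expectation, so the luminance term of the standard SSIM definition equals 1 and only the source variance and the MSE remain:

```latex
\begin{align*}
\mathrm{SSIM}(x, y)
  &= \frac{(2\mu_x\mu_y + C_1)(2\sigma_{xy} + C_2)}
          {(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)} \\
  &= \frac{2\sigma_x^2 + C_2}{2\sigma_x^2 + \sigma_n^2 + C_2}
   = \frac{2\sigma_x^2 + C_2}{2\sigma_x^2 + \mathrm{MSE} + C_2}
\end{align*}
```

Here C_1 and C_2 are the usual SSIM stabilising constants. For a fixed MSE, the expression is closer to 1 when σ_x² is larger, which matches the intuition that noise is less visible in textured regions.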


multimedia signal processing | 2011

On residual quad-tree coding in HEVC

Yih Han Tan; Chuohao Yeo; Hui Li Tan; Zhengguo Li

In the current working draft of HEVC, residual quad-tree (RQT) coding is used to encode prediction residuals in both Intra and Inter coding units (CU). However, the rationale for using RQT as a coding tool is different in the two cases. For Intra prediction units, RQT provides an efficient syntax for coding a number of sub-blocks with the same intra prediction mode. For Inter CUs, RQT adapts to the spatial-frequency variations of the CU, using as large a transform size as possible while catering to local variations in residual statistics. While providing coding gains, effective use of RQT currently requires an exhaustive search of all possible combinations of transform sizes within a block. In this paper, we exploit our insights to develop two fast RQT algorithms, each designed to meet the needs of Intra and Inter prediction residual coding.


international conference on acoustics, speech, and signal processing | 2012

On rate distortion optimization using SSIM

Chuohao Yeo; Hui Li Tan; Yih Han Tan

In this paper, we present a method for performing rate-distortion optimization (RDO) using a perceptual visual quality metric, the structural similarity index (SSIM), as the target of optimization. Rate-distortion optimization is widely used in modern video codecs to make various encoder decisions to optimize the rate-distortion tradeoff. Typically, the distortion measure used is either sum-of-square error or sum-of-absolute distance, both of which are convenient when used in the RDO framework but not always reflective of perceptual visual quality. We show that SSIM can be used as the distortion metric in the RDO framework in a simple, yet effective, manner by scaling the Lagrange multiplier used in RDO based on the local variance in that region. The experimental results on the H.264/AVC reference software show that compared to traditional RDO approaches, for the same SSIM score, the proposed approach can achieve an average rate reduction of about 9% and 14% for random access and low-delay encoding configurations, respectively. At the same time, there is no significant change in the encoding runtime.


IEEE Transactions on Broadcasting | 2016

Fast Coding Quad-Tree Decisions Using Prediction Residuals Statistics for High Efficiency Video Coding (HEVC)

Hui Li Tan; Chi Chung Ko; Susanto Rahardja

High Efficiency Video Coding (HEVC) is the latest video coding standard to meet market demands for real-time high quality video codecs. As compared to its predecessor H.264/AVC, HEVC can achieve significant compression gains but with higher encoding complexity. Therefore, for real-time applications, significant encoding time reduction is still necessary. In the HEVC test model, the large number of coding quad-tree decisions to be tested during rate-distortion optimization would result in high encoding time. Hence, we propose a method to reduce the high encoding time by pruning the coding quad-trees using prediction residuals statistics. Experimental results from HM16.3-based implementations show that the proposed residual-based pruning method can reduce encoding time by an average of about 44% with an average of about 1.0% coding loss.


international conference on acoustics, speech, and signal processing | 2013

SSIM-based adaptive quantization in HEVC

Chuohao Yeo; Hui Li Tan; Yih Han Tan

HEVC is an emerging video coding standard that can achieve significant compression gains compared to H.264/AVC due to the inclusion of numerous new coding tools. In particular, it allows for a flexible quadtree based block partitioning of each coding tree unit (CTU) and an ability to switch quantization parameters (QP) on a sub-CTU level. In this paper, we present an approach for selecting quantization parameters for each block of pixels on the basis of optimizing the SSIM of the entire picture. Our simulation results show that when SSIM is the quality metric, the proposed approach is able to give average BD-Rate gains of 5.5% to 7.4% compared to using a constant QP per picture while having a negligible increase in encoding runtime. In addition, our proposed method also significantly outperforms the MPEG-2 TM5 adaptive quantization algorithm implemented in the HEVC reference software.


international symposium on circuits and systems | 2010

Audio onset detection using energy-based and pitch-based processing

Hui Li Tan; Yongwei Zhu; Lekha Chaisorn; Susanto Rahardja

Leveraging the strength of energy-based processing for transient detection and of pitch-based processing for detecting softer onsets, we present a system that combines both energy and pitch cues for detecting onsets from different instrument categories. Given an audio input from an arbitrary instrument category, the system performs a preliminary onset categorization based on the general note characteristics and then integrates the detected onsets based on that categorization. In addition, the proposed pitch processing technique explores musically relevant features extracted from the chromagram, which are robust for detecting pitch changes. The proposed system showed good detection performance on the MIREX audio onset detection dataset.
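The energy-based half of such a detector can be sketched in a few lines. This is a generic short-time-energy baseline, not the system described in the paper: `energy_onsets`, the frame length and the `ratio` threshold are all assumptions chosen for illustration, and the pitch (chromagram) path is omitted.

```python
from typing import List, Sequence


def energy_onsets(samples: Sequence[float], frame: int = 4,
                  ratio: float = 2.0, floor: float = 1e-9) -> List[int]:
    """Return indices of frames whose energy exceeds `ratio` times the previous frame's."""
    # Short-time energy over non-overlapping frames.
    energies = [sum(s * s for s in samples[i:i + frame])
                for i in range(0, len(samples) - frame + 1, frame)]
    onsets = []
    for k in range(1, len(energies)):
        # `floor` keeps the comparison meaningful after pure silence.
        if energies[k] > ratio * max(energies[k - 1], floor):
            onsets.append(k)
    return onsets
```

A detector like this responds well to percussive transients but tends to miss soft onsets where the energy barely changes, which is the gap the pitch-based cue is meant to cover.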


international conference on acoustics, speech, and signal processing | 2013

Perceptually relevant energy function for seam carving

Hui Li Tan; Yih Han Tan; Zhengguo Li; Susanto Rahardja; Chuohao Yeo

Seam carving, an image re-targeting method, works by progressively finding and removing connected paths of low-energy pixels in an image until a desired image aspect ratio is reached. In this paper, we first cast the problem of minimizing an energy function as that of minimizing a distortion cost. We then leverage our understanding of image quality (distortion) metrics to propose a perceptually relevant energy function. Experimental results show that our proposed energy function can generate more desirable resized images in which the original structures of the images are better preserved.
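The seam search itself is a standard dynamic program, sketched below for one vertical seam. The function name `min_vertical_seam` is hypothetical, and the energy map is taken as an input: any energy function can be plugged in, and the perceptually relevant one proposed in the paper is not reproduced here.

```python
from typing import List


def min_vertical_seam(energy: List[List[float]]) -> List[int]:
    """Return one column index per row for the lowest-energy 8-connected vertical seam."""
    rows, cols = len(energy), len(energy[0])
    cost = [row[:] for row in energy]  # cumulative minimum-cost table
    for r in range(1, rows):
        for c in range(cols):
            lo, hi = max(c - 1, 0), min(c + 1, cols - 1)
            # Cheapest way to reach (r, c) from the three neighbours above.
            cost[r][c] += min(cost[r - 1][lo:hi + 1])
    # Backtrack from the cheapest cell in the last row.
    c = min(range(cols), key=lambda j: cost[-1][j])
    seam = [c]
    for r in range(rows - 1, 0, -1):
        lo, hi = max(c - 1, 0), min(c + 1, cols - 1)
        c = min(range(lo, hi + 1), key=lambda j: cost[r - 1][j])
        seam.append(c)
    return seam[::-1]
```

Removing the returned pixel from each row narrows the image by one column; repeating the search on the updated energy map continues until the target width is reached.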


international conference on multimedia and expo | 2009

Rhythm analysis for personal and social music applications using drum loop patterns

Hui Li Tan; Yongwei Zhu; Susanto Rahardja; Lekha Chaisorn

The development of effective music retrieval and recommendation applications requires meaningful features for characterizing music audio. The rhythm of music audio in particular, besides timbre and melody, is essential in describing the music piece. This paper illustrates how drum loop patterns characterize the salient rhythm structure of music and describes an approach for drum loop pattern extraction. The extracted drum loop patterns are formed based on the temporal regularity (accent on meter) and drum types (bass and snare) of the beats and are good representations of the rhythmic dimension of music. The extracted drum loop patterns are useful for music segmentation; further work would include music classification and similarity matching.

Collaboration


Top Co-Authors


Chi Chung Ko

National University of Singapore


Fengjiao Liu

Nanyang Technological University
