Yih Han Tan
National University of Singapore
Publication
Featured research published by Yih Han Tan.
IEEE Transactions on Circuits and Systems for Video Technology | 2012
Chuohao Yeo; Yih Han Tan; Zhengguo Li; Susanto Rahardja
The use of mode-dependent transforms for coding directional intra prediction residuals has been previously shown to provide coding gains, but the transform matrices have to be derived from training. In this paper, we derive a set of separable mode-dependent transforms by using a simple separable, directional, and anisotropic image correlation model. Our analysis shows that only one additional transform, the odd type-3 discrete sine transform (ODST-3), is required for the optimal implementation of mode-dependent transforms. In addition, the four-point ODST-3 also has a structure that can be exploited to reduce the operation count of the transform operation. Experimental results show that in terms of coding efficiency, our proposed approach matches or improves upon the performance of a mode-dependent transforms approach that uses transform matrices obtained through training.
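As a rough illustration of the transform named in this abstract, the sketch below builds an N-point sine transform with the basis often written as (2/sqrt(2N+1)) * sin((2k+1)(n+1)π/(2N+1)) and checks that its rows are orthonormal. Treating this basis as the paper's ODST-3 is an assumption, since the abstract does not state the formula, and the function names are illustrative only.

```python
import math

def odst3_matrix(n_points):
    """Return an n_points x n_points sine-transform matrix (assumed ODST-3 basis)."""
    scale = 2.0 / math.sqrt(2 * n_points + 1)
    return [
        [scale * math.sin((2 * k + 1) * (n + 1) * math.pi / (2 * n_points + 1))
         for n in range(n_points)]
        for k in range(n_points)
    ]

def max_orthonormality_error(matrix):
    """Largest absolute deviation of matrix * matrix^T from the identity."""
    size = len(matrix)
    worst = 0.0
    for i in range(size):
        for j in range(size):
            dot = sum(matrix[i][m] * matrix[j][m] for m in range(size))
            target = 1.0 if i == j else 0.0
            worst = max(worst, abs(dot - target))
    return worst

T4 = odst3_matrix(4)
# Expected to be near zero (floating-point rounding only).
print(max_orthonormality_error(T4))
```

Because the matrix is orthonormal, its transpose serves as the inverse transform, which is what makes it usable as a drop-in replacement for the DCT along rows or columns.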
IEEE Transactions on Circuits and Systems for Video Technology | 2013
Chuohao Yeo; Hui Li Tan; Yih Han Tan
In this paper, we present a method for performing rate-distortion optimization (RDO) using a perceptual visual quality metric, the structural similarity index (SSIM), as the target of optimization. Rate-distortion optimization is widely used in modern video codecs to make various encoder decisions to optimize the rate-distortion tradeoff. Typically, the distortion measure used is either sum-of-square error or sum-of-absolute distance, both of which are convenient when used in the RDO framework but not always reflective of perceptual visual quality. We show that SSIM can be used as the distortion metric in the RDO framework in a simple, yet effective, manner by scaling the Lagrange multiplier used in RDO based on the local variance in that region. The experimental results on the H.264/AVC reference software show that compared to traditional RDO approaches, for the same SSIM score, the proposed approach can achieve an average rate reduction of about 9% and 14% for random access and low-delay encoding configurations, respectively. At the same time, there is no significant change in the encoding runtime.
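The idea of scaling the Lagrange multiplier by local variance can be sketched as follows. The abstract does not give the scaling rule, so the form used here, a factor proportional to (2 * variance + C2) and normalized by its geometric mean over all blocks, is an assumption, as are the function names. C2 is the usual SSIM stabilizing constant for 8-bit samples.

```python
import math

C2 = (0.03 * 255) ** 2  # SSIM stabilizing constant for 8-bit samples

def block_variance(block):
    """Population variance of a flat list of sample values."""
    mean = sum(block) / len(block)
    return sum((s - mean) ** 2 for s in block) / len(block)

def scaled_lambdas(blocks, base_lambda):
    """Scale base_lambda per block by (2*var + C2), normalized by the
    geometric mean of that term over all blocks (assumed rule)."""
    terms = [2.0 * block_variance(b) + C2 for b in blocks]
    log_mean = sum(math.log(t) for t in terms) / len(terms)
    geo_mean = math.exp(log_mean)
    return [base_lambda * t / geo_mean for t in terms]

flat_block = [128] * 16          # smooth region
textured_block = [0, 255] * 8    # busy region
print(scaled_lambdas([flat_block, textured_block], 10.0))
```

Under this assumed rule a smooth block receives a smaller multiplier (distortion is penalized more heavily) and a textured block a larger one, while the geometric-mean normalization keeps the overall operating point close to that of the unscaled multiplier.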
IEEE Transactions on Image Processing | 2013
Hui Li Tan; Zhengguo Li; Yih Han Tan; Susanto Rahardja; Chuohao Yeo
Image quality metrics (IQMs), such as the mean squared error (MSE) and the structural similarity index (SSIM), are quantitative measures to approximate perceived visual quality. In this paper, through analyzing the relationship between the MSE and the SSIM under an additive noise distortion model, we propose a perceptually relevant MSE-based IQM, MSE-SSIM, which is expressed in terms of the variance of the source image and the MSE between the source and distorted images. Evaluations on three publicly available databases (LIVE, CSIQ, and TID2008) show that the proposed metric, despite requiring less computation, compares favourably in performance to several existing IQMs. In addition, due to its simplicity, MSE-SSIM is amenable to use in a wide range of image and video tasks that involve solving an optimization problem. As an example, MSE-SSIM is used as the objective function in designing a Wiener filter that aims at optimizing the perceptual visual quality of the output. Experimental results show that the images filtered with an MSE-SSIM-optimal Wiener filter have better visual quality than those filtered with an MSE-optimal Wiener filter.
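The relationship between the MSE and the SSIM under additive noise can be worked out in a few lines. The derivation below assumes a distorted signal y = x + n with zero-mean noise n of variance \sigma_n^2 that is uncorrelated with the source x, and uses the standard SSIM definition with constants C_1 and C_2. Whether the paper defines MSE-SSIM in exactly this form (for example, with local windows and pooling) is not stated in the abstract, so this should be read as a sketch of the idea only.

```latex
% Assumptions: y = x + n, E[n] = 0, Var(n) = \sigma_n^2, n uncorrelated with x.
\begin{align*}
\mu_y &= \mu_x, \qquad
\sigma_y^2 = \sigma_x^2 + \sigma_n^2, \qquad
\sigma_{xy} = \sigma_x^2, \qquad
\mathrm{MSE} = E\!\left[(x - y)^2\right] = \sigma_n^2, \\[4pt]
\mathrm{SSIM}(x, y)
  &= \frac{(2\mu_x\mu_y + C_1)(2\sigma_{xy} + C_2)}
          {(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)}
   = \frac{2\sigma_x^2 + C_2}{2\sigma_x^2 + \mathrm{MSE} + C_2}.
\end{align*}
```

The result depends only on the source variance and the MSE, which matches the description of the metric in the abstract: for a given MSE, a low-variance (smooth) source yields a lower score than a high-variance (textured) one.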
International Symposium on Circuits and Systems | 2011
Chuohao Yeo; Yih Han Tan; Zhengguo Li; Susanto Rahardja
In this paper, we derive separable KLTs for coding H.264/AVC intra prediction residuals, using a simple image correlation model. Our analysis shows that for some intra prediction modes, we can in fact just use the DCT for performing either the row-wise or column-wise transform. Furthermore, we also compute the KLT that should be used based on the image correlation model, which happens to have sinusoidal terms. The 4 × 4 transform also has a structure that can be exploited to reduce the operation count of the transform operation. In our simplified implementation of mode-dependent directional transforms (MDDT), we only need to make use of two matrices: the DCT and the derived KLT. Our experimental results show that in terms of coding efficiency, our proposed approach has similar performance when compared with MDDT. More importantly, compared to MDDT, our approach requires no training and has lower computational and storage costs.
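One way such a structure can reduce the operation count is shown below. The sketch assumes the 4-point transform has entries (2/3) * sin(mπ/9) for m = 1 to 4, which is an assumption about the derived KLT rather than something the abstract states. With that basis, the identity sin(π/9) + sin(2π/9) = sin(4π/9) and the repeated entries in the second row allow 8 multiplications instead of 16. This factorization is one possibility and is not necessarily the one used in the paper.

```python
import math

# Assumed 4-point sine basis values: (2/3) * sin(m*pi/9) for m = 1, 2, 3, 4.
A, B, C, D = ((2.0 / 3.0) * math.sin(m * math.pi / 9) for m in (1, 2, 3, 4))

def direct_transform(x):
    """Direct 4x4 matrix-vector product: 16 multiplications."""
    matrix = [
        [A, B, C, D],
        [C, C, 0.0, -C],
        [D, -A, -C, B],
        [B, -D, C, -A],
    ]
    return [sum(row[n] * x[n] for n in range(4)) for row in matrix]

def fast_transform(x):
    """Same output using D = A + B and the repeated C entries: 8 multiplications."""
    x0, x1, x2, x3 = x
    c0 = x0 + x3
    c1 = x1 + x3
    c2 = x0 - x1
    c3 = C * x2
    return [
        A * c0 + B * c1 + c3,
        C * (x0 + x1 - x3),
        A * c2 + B * c0 - c3,
        B * c2 - A * c1 + c3,
    ]

sample = [3.0, -1.0, 4.0, 2.0]
print(direct_transform(sample))
print(fast_transform(sample))
```

Both functions should print the same four values up to floating-point rounding.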
International Conference on Image Processing | 2011
Chuohao Yeo; Yih Han Tan; Zhengguo Li
Applying mode-dependent separable transforms, e.g., mode-dependent directional transform (MDDT), is an effective method for improving transform coding of intra prediction residuals. However, two transform matrices typically need to be stored for each intra prediction mode. By using a simple image correlation model, we have previously derived and proposed a simplified mode-dependent separable transforms scheme that uses a combination of two well-known transforms: Discrete Cosine Transform (DCT) and Discrete Sine Transform (DST). In this paper, we propose an orthogonal 4-point integer DST that has a multiplier-less implementation consisting of only adds and bit-shifts. We also propose a simple set of mode-dependent scans for coefficient coding that can be used on top of mode-dependent transforms. Our experimental results on the current HEVC reference software show that in terms of coding efficiency, our proposed approach has comparable performance to MDDT. More importantly, compared to MDDT, our approach requires no training and has lower computational and storage costs.
International Conference on Acoustics, Speech, and Signal Processing | 2013
Chuohao Yeo; Hui Li Tan; Yih Han Tan
HEVC is an emerging video coding standard that can achieve significant compression gains compared to H.264/AVC due to the inclusion of numerous new coding tools. In particular, it allows for a flexible quadtree based block partitioning of each coding tree unit (CTU) and an ability to switch quantization parameters (QP) on a sub-CTU level. In this paper, we present an approach for selecting quantization parameters for each block of pixels on the basis of optimizing the SSIM of the entire picture. Our simulation results show that when SSIM is the quality metric, the proposed approach is able to give average BD-Rate gains of 5.5% to 7.4% compared to using a constant QP per picture while having a negligible increase in encoding runtime. In addition, our proposed method also significantly outperforms the MPEG-2 TM5 adaptive quantization algorithm implemented in the HEVC reference software.
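In the H.264/AVC and HEVC reference encoders the Lagrange multiplier grows in proportion to 2^(QP/3), so multiplying the multiplier by a factor s corresponds to a QP change of about 3 * log2(s). The sketch below converts a per-block scale factor into an integer QP offset under that relation. That the paper maps SSIM-driven scaling to QP in this way is an assumption, as are the clipping range and the function name.

```python
import math

def qp_offset_from_lambda_scale(scale, max_abs_offset=6):
    """Map a lambda scale factor to an integer QP offset,
    assuming lambda is proportional to 2**(QP/3)."""
    if scale <= 0:
        raise ValueError("scale must be positive")
    offset = round(3.0 * math.log2(scale))
    # Clip to a hypothetical allowed range of QP offsets.
    return max(-max_abs_offset, min(max_abs_offset, offset))

for s in (0.5, 1.0, 2.0):
    print(s, qp_offset_from_lambda_scale(s))
```

A scale factor below 1 (as might be assigned to a smooth block) lowers the QP for finer quantization, and a factor above 1 raises it.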
IEEE Transactions on Circuits and Systems for Video Technology | 2010
Yih Han Tan; Wei Siong Lee; Jo Yew Tham; Susanto Rahardja; Kin Mun Lye
The H.264/AVC video coding standard encapsulates the most advanced video coding tools. Since the various techniques that lead to better coding efficiency of the coding standard also inevitably increase the complexity of the video encoder, real-time H.264 encoding of video streams is a challenging task. If the available computational resources do not allow the entire encoding process to be carried out in time, a complexity scalable technique that ensures a graceful degradation of coding performance will be a valuable tool. We designed a video encoding scheme that allows the rate-distortion (R-D) process to be carried out in a complexity scalable fashion. Our proposed singularly parameterized complexity scalable scheme allows the control of the tradeoff between complexity and coding performance when available resources are limited and the optimal R-D performance is unattainable.
International Conference on Acoustics, Speech, and Signal Processing | 2013
Yih Han Tan; Chuohao Yeo; Zhengguo Li
Incorporating sample-based prediction during lossless coding can significantly improve coding performance. However, its use within a codec designed for lossy coding requires a modification of the available prediction scheme. When implementing the codec, two different prediction processes will have to be implemented. This paper describes a lossless coding scheme that delays the sample-based prediction till the residue coding stage of the codec and carries out prediction in the residual domain. In this way, the prediction scheme of the lossy coder can be retained while realizing the coding gains associated with sample-based prediction. The proposed scheme improves lossless intra coding performance in HEVC Main Profile by an average of 6.5%.
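Prediction in the residual domain can be illustrated with a simple sample-by-sample scheme applied to the residual that the block-based predictor leaves behind. The abstract does not specify the predictor, so the horizontal DPCM used here (each residual sample predicted from its left neighbour) is an assumption chosen for brevity, and the function names are illustrative.

```python
def residual_dpcm_forward(residual):
    """Horizontal DPCM on a 2-D residual block:
    each sample is predicted from its left neighbour (0 for the first column)."""
    return [
        [row[j] - (row[j - 1] if j > 0 else 0) for j in range(len(row))]
        for row in residual
    ]

def residual_dpcm_inverse(coded):
    """Invert the DPCM by accumulating along each row; exact for integers."""
    restored = []
    for row in coded:
        acc = 0
        out = []
        for value in row:
            acc += value
            out.append(acc)
        restored.append(out)
    return restored

block_residual = [[5, 6, 6, 7], [-2, -2, -1, 0]]
coded = residual_dpcm_forward(block_residual)
print(coded)
print(residual_dpcm_inverse(coded) == block_residual)
```

Because the step operates only on the residual, the block-based prediction that produced it is left untouched, which is the property the abstract highlights.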
International Conference on Image Processing | 2009
Yih Han Tan; Wei Siong Lee; Jo Yew Tham; Susanto Rahardja
The H.264/AVC video coding standard encapsulates the most advanced video coding tools. The various techniques that lead to better coding efficiency of the coding standard also inevitably increase the complexity of the video encoder. Thus, real-time encoding of video streams with the H.264 coding standard is a challenging task. If the available computational resources do not allow the entire encoding process to be carried out in time, a complexity scalable technique that ensures a graceful degradation of coding performance will be a useful tool. This work proposes a singularly-parameterized complexity scalable rate-distortion framework for H.264/AVC encoders.
International Conference on Computer Communications and Networks | 2009
Yih Han Tan; Wei Siong Lee; Jo Yew Tham; Susanto Rahardja
In this paper, we introduce a singularly-parameterized complexity scalable H.264-compliant encoder. By modeling its complexity-rate-distortion relationships, we derive an optimized operating mode of the encoder (rate and complexity) and show through experiments that such optimization can help a video encoder operate within rate and time constraints. The design of the complexity scalable encoding scheme enables the encoder to perform optimization while taking into consideration the availability of computational resources. This extension of traditional rate-distortion optimization is necessary when time or power constraints do not allow a video encoder to achieve rate-distortion optimized coding performance. Our optimization scheme outputs parameters that allow the encoder to be as close to being rate-distortion optimized as possible, within rate and complexity constraints. Index Terms: H.264, power consumption, complexity