Yi-Hsin Huang
National Taiwan University
Publication
Featured research published by Yi-Hsin Huang.
IEEE Transactions on Circuits and Systems for Video Technology | 2010
Yi-Hsin Huang; Tao-Sheng Ou; Po-Yen Su; Homer H. Chen
The rate-distortion optimization (RDO) framework for video coding achieves a tradeoff between bit-rate and quality. However, objective distortion metrics such as mean squared error traditionally used in this framework are poorly correlated with perceptual quality. We address this issue by proposing an approach that incorporates the structural similarity index as a quality metric into the framework. In particular, we develop a predictive Lagrange multiplier estimation method to resolve the chicken-and-egg dilemma of perceptual-based RDO and apply it to H.264 intra and inter mode decision. Given a perceptual quality level, the resulting video encoder achieves on average 9% bit-rate reduction for intra-frame coding and 11% for inter-frame coding over the JM reference software. Subjective testing further confirms that, at the same bit-rate, the proposed perceptual RDO indeed preserves image details and prevents blocking artifacts better than traditional RDO.
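As a rough illustration of the idea in this abstract, the sketch below scores candidate coding modes with a Lagrangian cost whose distortion term comes from the structural similarity index instead of mean squared error. It is not the authors' implementation: the distortion definition `1 - SSIM`, the single-window SSIM over a flat pixel list, and the names `ssim`, `rd_cost`, and `best_mode` are assumptions made for the example, and the Lagrange multiplier `lam` is simply passed in rather than predicted as in the paper.

```python
# Standard SSIM stabilizing constants for 8-bit pixel values.
C1 = (0.01 * 255) ** 2
C2 = (0.03 * 255) ** 2


def _mean(values):
    return sum(values) / len(values)


def ssim(x, y):
    """Single-window SSIM between two equal-length lists of pixel values."""
    mx, my = _mean(x), _mean(y)
    vx = _mean([(a - mx) ** 2 for a in x])
    vy = _mean([(b - my) ** 2 for b in y])
    cxy = _mean([(a - mx) * (b - my) for a, b in zip(x, y)])
    numerator = (2 * mx * my + C1) * (2 * cxy + C2)
    denominator = (mx * mx + my * my + C1) * (vx + vy + C2)
    return numerator / denominator


def rd_cost(orig, recon, bits, lam):
    """Lagrangian cost J = D + lam * R, with D = 1 - SSIM (an assumption)."""
    return (1.0 - ssim(orig, recon)) + lam * bits


def best_mode(orig, candidates, lam):
    """Pick the mode with the lowest cost.

    candidates maps a hypothetical mode name to (reconstructed_pixels, bits).
    """
    return min(
        candidates,
        key=lambda m: rd_cost(orig, candidates[m][0], candidates[m][1], lam),
    )
```

With `lam = 0` the choice depends only on structural similarity, while a larger `lam` shifts it toward cheaper modes. Choosing `lam` needs the rate and distortion behavior that the mode decision itself produces, which is presumably the circularity the abstract calls a chicken-and-egg dilemma.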
IEEE Transactions on Circuits and Systems for Video Technology | 2011
Tao-Sheng Ou; Yi-Hsin Huang; Homer H. Chen
The quality of video is ultimately judged by the human eye; however, mean squared error and similar measures that have been used as quality metrics are poorly correlated with human perception. Although the characteristics of the human visual system have been incorporated into perceptual-based rate control, most existing schemes do not take rate-distortion optimization into consideration. In this paper, we use the structural similarity index as the quality metric for rate-distortion modeling and develop an optimum bit allocation and rate control scheme for video coding. This scheme achieves up to 25% bit-rate reduction over the JM reference software of H.264. Under the rate-distortion optimization framework, the proposed scheme can be easily integrated with the perceptual-based mode decision scheme. The overall bit-rate reduction may reach as high as 32% over the JM reference software.
IEEE Transactions on Circuits and Systems for Video Technology | 2010
Yi-Hsin Huang; Tao-Sheng Ou; Homer H. Chen
The spatial-domain intra prediction scheme of H.264 has high computational complexity, especially for the High Profile, as it incorporates the additional intra 8 × 8 prediction mode. To address this issue, we explore the hierarchy of the H.264 mode decision process in this paper and adopt an approach that is in synchrony with the mode decision hierarchy. In particular, we propose a variance-based algorithm for block size decision, an improved filter-based algorithm for prediction mode decision using contextual information, and a selection algorithm for intra block decision that exploits the relation between the rate-distortion characteristic and the best coding type. A performance comparison is provided to show the improvement of the proposed algorithms over previous methods.
International Conference on Multimedia and Expo | 2010
Homer H. Chen; Yi-Hsin Huang; Po-Yen Su; Tao-Sheng Ou
The goal of this work is to seek a feasible direction of video coding that can provide a significant quality improvement over H.264. In light of the well-known findings that the distortion metric for video quality has a profound impact on video coding performance and that traditional metrics such as mean square error are poorly correlated with human perception, we identify perceptual video coding, more specifically, perceptual-based rate-distortion optimization, as a sensible approach that has the potential to help drive the performance of video coding to a significantly higher quality level. This technology assessment is supported by experiments with various video sequences, bit-rates, and encoding profiles. The results show that the perceptual-based RDO can indeed bring significant quality improvement for H.264.
Visual Communications and Image Processing | 2010
Tao-Sheng Ou; Yi-Hsin Huang; Homer H. Chen
Since the ultimate receivers of encoded video are human eyes, the characteristics of the human visual system should be taken into consideration in the design of bit allocation to improve the perceptual video quality. In this paper, we incorporate the structural similarity index as a distortion metric and propose a novel rate-distortion model to characterize the relationship between rate and the structural similarity index. Based on the model, we develop an optimum bit allocation and rate control scheme for H.264 encoders. Experimental results show that up to 25% bit-rate reduction over the JM reference software can be achieved. Subjective evaluation further confirms that the proposed scheme preserves more structural information and improves the perceptual quality of the encoded video.
International Symposium on Circuits and Systems | 2010
Yi-Hsin Huang; Tao-Sheng Ou; Homer H. Chen
The framework of rate-distortion optimization (RDO) has been widely adopted for video coding to achieve a good trade-off between bit-rate and distortion. However, objective distortion metrics such as mean square error traditionally used in this framework are poorly correlated with perceptual video quality. To address this issue, we incorporate the structural similarity index as a quality metric into the framework and develop a predictive Lagrange multiplier selection technique to resolve the chicken-and-egg dilemma of perceptual-based RDO. The resulting perceptual-based RDO is then applied to H.264 intra mode decision to illustrate the proposed technique. Given a perceptual quality level, 5%–10% bit-rate reduction over the JM reference software of H.264 is achieved. Subjective evaluation further confirms that, at the same bit-rate, the proposed perceptual RDO preserves image details and prevents blocking artifacts better than the traditional RDO.
Picture Coding Symposium | 2009
Tao-Sheng Ou; Yi-Hsin Huang; Homer H. Chen
The rate-distortion optimization framework employed in H.264 for the selection of best coding mode is computationally expensive, although it achieves remarkable coding efficiency. In this paper, an integrated intra prediction algorithm for intra-frame coding of the H.264 High Profile is proposed. It consists of two components: a variance-based method for macroblock mode decision and an improved filter-based method for prediction mode decision using contextual information. The two components can work independently or jointly to achieve higher efficiency. The integrated algorithm achieves an average saving of 51.3% (maximum 66.0%) in encoding time for intra coded frames, with negligible effect on PSNR and bit-rate.
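The variance-based component mentioned in this abstract can be pictured with the following minimal sketch. The abstract does not give the decision rule, so everything specific here is an assumption: the idea that low-variance (smooth) macroblocks are tested only with intra 16×16 and high-variance (textured) ones only with the smaller block sizes, the two threshold values, and the names `block_variance` and `candidate_block_sizes`.

```python
def block_variance(pixels):
    """Population variance of the luma samples of one macroblock."""
    n = len(pixels)
    mean = sum(pixels) / n
    return sum((p - mean) ** 2 for p in pixels) / n


def candidate_block_sizes(macroblock, low_thresh=50.0, high_thresh=500.0):
    """Return the intra block sizes worth testing for a 16x16 macroblock.

    macroblock is a flat list of 256 luma samples. Both thresholds are
    hypothetical values chosen for illustration, not taken from the paper.
    """
    variance = block_variance(macroblock)
    if variance < low_thresh:
        # Smooth content: assume one large prediction block is enough.
        return ["intra16x16"]
    if variance > high_thresh:
        # Detailed content: assume only the smaller block sizes are useful.
        return ["intra8x8", "intra4x4"]
    # Ambiguous content: fall back to testing every block size.
    return ["intra16x16", "intra8x8", "intra4x4"]
```

Pruning the candidate list before the full rate-distortion search is where the encoding-time saving would come from in a scheme like this, since each skipped block size avoids a full set of prediction and cost evaluations.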
Electronic Imaging | 2009
Wen-Fu Lee; Tai-Hsiang Huang; Yi-Hsin Huang; Mei-Lan Chu; Homer H. Chen
The saliency map is useful for many applications such as image compression, display, and visualization. However, the bottom-up model used in most saliency map construction methods is computationally expensive. The purpose of this paper is to improve the efficiency of the model for automatic construction of the saliency map of an image while preserving its accuracy. In particular, we remove the contrast sensitivity function and the visual masking component of the bottom-up visual attention model and retain the components related to perceptual decomposition and center-surround interaction, which are critical properties of the human visual system. The simplified model is verified by performance comparison with the ground truth. In addition, a salient region enhancement technique is adopted to enhance the connectivity of the saliency map, and the saliency maps of three color channels are fused to enhance the prediction accuracy. Experimental results show that the average correlation between our algorithm and the ground truth is close to that between the original model and the ground truth, while the computational complexity is reduced by 98%.
Picture Coding Symposium | 2009
Yi-Hsin Huang; Tao-Sheng Ou; Homer H. Chen
Despite its remarkable coding efficiency, the computational complexity of the spatial-domain intra prediction scheme of H.264 remains a challenging issue, especially for applications of the High Profile, as it incorporates the additional intra 8×8 prediction mode. To address this issue, we propose in this paper a fast selective intra mode decision algorithm that exploits a newly discovered property of inter-frame coding: the difference in rate-distortion cost between the best inter mode and the intra 16×16 mode is highly related to the type of the best coding mode. The proposed algorithm has high prediction accuracy, low computational overhead, and negligible effect on PSNR and bit-rate. A performance comparison is provided to show the superiority of the algorithm.
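One way to read the property described in this abstract is as an early-termination test, sketched below. The abstract does not state the actual rule, so the direction of the comparison, the threshold value, and the names `skip_extra_intra_search` and `intra_modes_to_test` are all assumptions for illustration: when intra 16×16 costs far more than the best inter mode, the macroblock is assumed to end up inter coded and the remaining intra modes are not evaluated.

```python
def skip_extra_intra_search(j_best_inter, j_intra16, threshold):
    """Return True when the remaining intra modes are unlikely to win.

    j_best_inter and j_intra16 are rate-distortion costs already computed
    by the encoder. threshold is a hypothetical tuning parameter.
    """
    return (j_intra16 - j_best_inter) > threshold


def intra_modes_to_test(j_best_inter, j_intra16, threshold=200.0):
    """List the additional intra modes to evaluate for an inter-frame macroblock."""
    if skip_extra_intra_search(j_best_inter, j_intra16, threshold):
        # Large cost gap: assume the best coding type is inter.
        return []
    # Small gap: intra may still win, so run the full intra search.
    return ["intra8x8", "intra4x4"]
```

The test reuses two costs the encoder has to compute anyway, which is consistent with the low computational overhead the abstract reports.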
Visual Communications and Image Processing | 2011
Po-Yen Su; Yi-Hsin Huang; Tao-Sheng Ou; Homer H. Chen
The purpose of this technical demonstration is to show how compressed video quality can be improved by taking the characteristics of the human visual system into account in the rate-distortion optimization and rate control processes of a video encoder. Viewers can subjectively evaluate both the R-D performance of the video encoder and the picture quality in this demonstration and see that perceptual video coding is indeed a promising direction for next-generation video coding.