Shih-Hsuan Yang
National Taipei University of Technology
Publication
Featured research published by Shih-Hsuan Yang.
International Conference on Multimedia and Expo | 2008
Bo-Yuan Chen; Shih-Hsuan Yang
Support for variable block sizes, although it significantly improves compression efficiency, imposes a heavy computational burden on an H.264 encoder. In this paper, we present a new fast inter-mode decision algorithm based on the coded block pattern (CBP). Because it specifies which blocks within a macroblock are uncoded, the CBP is a good indicator of block-matching accuracy and can be used to reduce the rate-distortion calculation required for mode selection. Experimental results show that the proposed method achieves more than a 50% reduction in computation time relative to the exhaustive mode search, with negligible degradation in visual quality. Other merits of the proposed method are that (1) no extra computation is needed to obtain the CBP; (2) no ad hoc threshold is involved in the algorithm; and (3) in contrast to the previous early-skip algorithm, which yields very limited improvement for high-activity videos, the proposed method is highly reliable across various video characteristics.
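As a rough illustration of why the CBP is cheap to inspect, the sketch below splits an H.264 `coded_block_pattern` value into its luma part (four bits, one per 8x8 luma block) and its chroma part, and lists the uncoded luma blocks. The function names, the mode list, and the early-exit rule in `candidate_modes` are assumptions made for illustration only; the abstract does not state the paper's actual decision rule.

```python
# Illustrative sketch only; not the decision rule from the paper.

ALL_INTER_MODES = ["SKIP", "16x16", "16x8", "8x16", "8x8"]  # assumed mode list


def split_cbp(coded_block_pattern):
    """Split an H.264 coded_block_pattern into (luma, chroma) parts."""
    luma = coded_block_pattern % 16     # 4 bits, one per 8x8 luma block
    chroma = coded_block_pattern // 16  # 0, 1, or 2
    return luma, chroma


def uncoded_luma_blocks(coded_block_pattern):
    """Return indices (0..3) of 8x8 luma blocks with no coded coefficients."""
    luma, _ = split_cbp(coded_block_pattern)
    return [i for i in range(4) if not (luma >> i) & 1]


def candidate_modes(cbp_after_16x16):
    """Hypothetical early exit: if the 16x16 match leaves nothing to code,
    skip the rate-distortion check of the smaller partitions."""
    if cbp_after_16x16 == 0:
        return ["SKIP", "16x16"]
    return list(ALL_INTER_MODES)
```

For example, `uncoded_luma_blocks(5)` reports blocks 1 and 3 as uncoded, because only bits 0 and 2 of the luma part are set.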
IEEE Transactions on Circuits and Systems for Video Technology | 2007
Shih-Hsuan Yang; Po-Feng Cheng
A novel hybrid error-resilience and error-concealment technique for embedded wavelet coders is presented. Aimed at mitigating data loss in real-time visual transmission over packet erasure channels, the proposed method incorporates data partitioning and multiple description coding into the set partitioning in hierarchical trees (SPIHT) encoding process. Each of the spatial-orientation trees of SPIHT is independently coded and packetized with multiple descriptions of important wavelet coefficients. At decoding, the coefficients that cannot be recovered are predicted through linear interpolation. The estimation is based on either intraband or interband correlation among wavelet coefficients. Experimental results show that the proposed method achieves good and stable error performance with low additional redundancy.
Systems, Man and Cybernetics | 2006
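The concealment step described above can be sketched for the intraband case as follows. This is a minimal sketch, assuming that a lost coefficient is estimated as the mean of its received 4-neighbours within the same subband; the function name, the neighbourhood, and the equal weights are assumptions, since the abstract does not give the interpolation formula.

```python
# Illustrative sketch only; the paper's interpolation weights are not given.

def conceal_intraband(band, lost):
    """Estimate lost wavelet coefficients from received neighbours.

    band: 2D list of coefficients for one subband.
    lost: set of (row, col) positions that were not received.
    Each lost coefficient becomes the mean of its received 4-neighbours,
    or 0.0 when no neighbour was received.
    """
    rows, cols = len(band), len(band[0])
    out = [row[:] for row in band]
    for r, c in lost:
        neighbours = [
            band[rr][cc]
            for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
            if 0 <= rr < rows and 0 <= cc < cols and (rr, cc) not in lost
        ]
        out[r][c] = sum(neighbours) / len(neighbours) if neighbours else 0.0
    return out
```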
Shih-Hsuan Yang; Fu-Min Jheng
Previous digital image stabilization (DIS) techniques perform hand-shake estimation independently in the horizontal and vertical directions. In this paper, we propose a new DIS algorithm that adaptively compensates for hand jitter in a two-dimensional framework. The proposed method also includes improved validation criteria for motion vectors. Performance evaluation of the system is conducted based on the motion diagram and the PSNR improvement when it is used in conjunction with an H.264 codec. Experimental results substantiate the superiority of the proposed system.
International Conference on Information Technology Research and Education | 2005
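The PSNR figure used in this evaluation follows the standard definition, PSNR = 10 log10(peak^2 / MSE). The sketch below computes it for two frames stored as 2D lists of 8-bit samples; the function name and the list-of-lists frame layout are assumptions for illustration.

```python
import math


def psnr(reference, test, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized frames."""
    squared_errors = [
        (a - b) ** 2
        for ref_row, test_row in zip(reference, test)
        for a, b in zip(ref_row, test_row)
    ]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(peak ** 2 / mse)
```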
Shih-Hsuan Yang; Chin-Feng Chen
Media hashing is an important technique for resolving copyright infringement. In this paper, a novel robust image hashing scheme is proposed. The set partitioning in hierarchical trees (SPIHT) algorithm, widely employed in image compression, is used to extract the identification information of images. The sorting pass of SPIHT records the spatial distribution of significant wavelet coefficients, termed the significance map. We build the hash values from the significance maps and the associated autocorrelograms. To verify the robustness of the proposed methods, experiments are conducted on the StirMark benchmarking system. The proposed hash sequence based on the autocorrelogram achieves good performance with reasonable complexity.
International Conference on Acoustics, Speech, and Signal Processing | 2003
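A significance map of the kind mentioned above can be sketched as follows. In SPIHT, a coefficient c is significant at bit plane n when |c| >= 2^n, and coding starts at n = floor(log2(max |c|)). The function names and the Hamming-distance comparison are assumptions for illustration; the autocorrelogram stage of the paper is not reproduced here.

```python
import math


def initial_bitplane(coeffs):
    """Highest bit plane n such that some coefficient satisfies |c| >= 2**n."""
    peak = max(abs(c) for row in coeffs for c in row)
    if peak == 0:
        raise ValueError("all coefficients are zero")
    return math.floor(math.log2(peak))


def significance_map(coeffs, n):
    """Binary map marking the coefficients with |c| >= 2**n."""
    threshold = 2 ** n
    return [[1 if abs(c) >= threshold else 0 for c in row] for row in coeffs]


def hamming_distance(map_a, map_b):
    """Number of positions where two equally sized binary maps differ
    (an assumed comparison measure, not taken from the paper)."""
    return sum(
        a != b
        for row_a, row_b in zip(map_a, map_b)
        for a, b in zip(row_a, row_b)
    )
```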
Shih-Hsuan Yang
Efficient image watermarking techniques have been developed in the wavelet domain. Similar to other wavelet-based image processing, the choice of wavelet filters generally affects the performance of a wavelet-based watermarking system. In this paper, we evaluate the performance of a set of biorthogonal integer wavelets under a multiresolution-watermarking framework. Biorthogonal integer wavelets have been extensively used for image applications because they possess the linear-phase property and can be efficiently implemented. We find that the widely adopted 9/7-F wavelet achieves the best robustness performance. Further investigation is conducted to show that the superiority of the 9/7-F wavelet is primarily owing to its being nearly orthogonal.
Advances in Multimedia | 2007
Dong-Woei Lin; Shih-Hsuan Yang
In this paper, we propose a new technique for extracting salient regions in an image. Identification of salient regions is useful for region/object based image processing. Previous works on salient regions/points typically involve complex detection and are not always reliable in terms of perceptual importance and robustness. This paper presents an efficient salient-region extraction algorithm based on the significance of accumulated wavelet coefficients. The proposed method is robust to common image processing such as compression, filtering, and geometric distortions. Experimental results substantiate the distinguished performance of the proposed method.
International Conference on Multimedia and Expo | 2001
Shih-Hsuan Yang; Yi-Lan Chang; Hsin-Chang Chen
The SPIHT (set partitioning in hierarchical trees) method proposed by Said and Pearlman is a very successful image compression algorithm, and has become the core technology of MPEG-4 and JPEG-2000. In this paper, we develop a compression-domain watermarking technique based on SPIHT coding. In contrast to the conventional approaches that incorporate watermarks into the transformed coefficients, the proposed method impresses the binary watermarking sequence directly on the bitstream generated in the quantization process. The security of the system is assured by the pseudo-random nature of the watermarking sequence and of the locations where it resides. The performance of the system can be enhanced with minimal complexity by a joint optimization of quantization and watermarking. Experiments show that our method survives the standard attacks, including JPEG lossy compression, rotation with auto-scaling, and the StirMark random geometric distortion.
Asia Pacific Conference on Circuits and Systems | 2002
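The role of the key-dependent locations can be illustrated with the toy sketch below, which writes watermark bits into pseudo-randomly chosen positions of a bit list and reads them back with the same key. It is only a sketch under strong assumptions: the function names are hypothetical, Python's `random.Random` stands in for a proper keyed generator, and the code ignores which SPIHT bits can be altered without breaking decoding, which the actual method must handle.

```python
import random


def _locations(key, stream_len, count):
    """Key-dependent, distinct embedding positions (illustrative PRNG choice)."""
    return random.Random(key).sample(range(stream_len), count)


def embed(bitstream, watermark, key):
    """Return a copy of bitstream with the watermark bits written in."""
    marked = list(bitstream)
    positions = _locations(key, len(marked), len(watermark))
    for pos, bit in zip(positions, watermark):
        marked[pos] = bit
    return marked


def extract(bitstream, length, key):
    """Read back `length` watermark bits using the same key."""
    return [bitstream[pos] for pos in _locations(key, len(bitstream), length)]
```

Without the key, neither the embedding positions nor their order can be reproduced from this sketch, which mirrors the security argument in the abstract.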
Shih-Hsuan Yang; Hsin-Chang Chen
In this paper, we develop a robust bit-plane watermarking technique based on zerotree coding. A robust watermark is an imperceptible but indelible code that can be used for ownership identification. Zerotree-based encoders are popular owing to their high coding efficiency, low coding complexity, and capability of generating scalable bitstreams. Efficient image compression and watermarking algorithms use similar techniques to reduce the introduced visual artifacts. In contrast to the conventional approaches where compression and watermarking are treated independently, the proposed method integrates these two operations. The watermark is inserted by directly modifying the output bitstream of the zerotree-based quantizer. Since the watermarking framework is tightly coupled with compression, we can avoid adding watermark information to the coefficients that would be susceptible to quantization errors. Furthermore, this compressed-domain watermarking framework may reduce the system complexity, since watermark embedding and retrieval may be performed directly in the compressed domain. Security of the system relies on the secret keys that determine the watermark signal and watermarking locations. A complete evaluation is conducted against both noise-type and geometric attacks. Experimental results show that our method survives the aforementioned attacks and that its performance is competitive with or better than that of transform-domain approaches.
Visual Communications and Image Processing | 2011
Shih-Hsuan Yang; Cyong-Wun Fan; Yu-Cheng Chen
TV commercials that intervene between program segments are essential to TV broadcasters but undesirable for the typical TV audience. Detection of TV commercials is thus important for many applications. In this paper, we propose a new TV commercial detection algorithm for digital TV broadcasting in Taiwan. The proposed method integrates audio and video features. The employed visual characteristics include still pictures, program titles, and rating logos. The employed audio characteristics include short silence, starting music, and ending music. The scenarios in which a program segment ends (a commercial starts) and a program segment starts (a commercial ends) are carefully analyzed to facilitate joint audio-visual detection. Experiments show that the proposed method can detect all commercial intervals with little or no time error.
International Conference on Multimedia and Expo | 2010
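Of the audio cues listed above, short silence is the simplest to sketch. The example below flags runs of low-energy frames in a list of audio samples. The frame length, energy threshold, minimum run length, and function name are all assumed parameters for illustration; the abstract does not specify how silence is measured.

```python
# Illustrative sketch only; thresholds and frame sizes are assumptions.

def silent_intervals(samples, frame_len, energy_threshold, min_frames):
    """Return (start_frame, end_frame) pairs, end exclusive, where the mean
    frame energy stays below energy_threshold for at least min_frames frames."""
    energies = [
        sum(s * s for s in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]
    intervals = []
    start = None
    # A trailing sentinel with infinite energy closes a run at the end.
    for idx, energy in enumerate(energies + [float("inf")]):
        if energy < energy_threshold:
            if start is None:
                start = idx
        else:
            if start is not None and idx - start >= min_frames:
                intervals.append((start, idx))
            start = None
    return intervals
```

In a joint detector, such intervals would only be candidate boundaries to be confirmed by the visual cues, such as still pictures or rating logos.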
Shih-Hsuan Yang; Yu-Shiuan Liou
Multiview video coding (MVC) is essential to many 3D video applications. Because MVC inherently contains more visual data and may employ inter-view prediction in addition to temporal prediction, the computational complexity involved is much higher than that of conventional single-view encoders. Fast algorithms are thus very desirable for the practical use of MVC. In this paper, we use coded block patterns, mode correlation among adjacent macroblocks, and RD cost comparison for fast selection of the best reference frame and inter-mode. Special values of coded block patterns, which indicate the accuracy of 16×16 motion estimation of a macroblock, are used to identify whether larger partitioning blocks (DIRECT, 16×16, 16×8, or 8×16) are suitable. The coded block pattern is also used to determine whether the subsequent inter-view prediction is required. Mode correlation and RD cost comparison are later employed to select other macroblocks that are more likely to use larger partitioning blocks. The experimental results show that the proposed algorithm can reduce encoding time by up to 90% with unnoticeable degradation in visual quality.