Jau-Ling Shih
Chung Hua University
Publication
Featured research published by Jau-Ling Shih.
Pattern Recognition | 2007
Jau-Ling Shih; Chang-Hsing Lee; Jian Tang Wang
The advances in 3D data acquisition techniques, graphics hardware, and 3D data modeling and visualization techniques have led to a proliferation of 3D models, making the search for specific 3D models a vital issue. Techniques for effective and efficient content-based retrieval of 3D models have therefore become an essential research topic. In this paper, a novel feature, called the elevation descriptor, is proposed for 3D model retrieval. The elevation descriptor is invariant to translation and scaling of 3D models and robust to rotation. First, six elevations are obtained to describe the altitude information of a 3D model from six different views. Each elevation is represented by a gray-level image, which is decomposed into several concentric circles. The elevation descriptor is obtained by taking the differences between the altitude sums of successive concentric circles. An efficient similarity matching method is used to find the best match for an input model. Experimental results show that the proposed method is superior to other descriptors, including spherical harmonics, the MPEG-7 3D shape spectrum descriptor, and D2.
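The ring-difference computation described in the abstract can be sketched as follows for a single view. This is a minimal illustration, not the paper's implementation: the ring count, the image-centred circular decomposition, and the uniform ring spacing are all assumptions.

```python
import numpy as np

def elevation_descriptor(depth_image, n_rings=8):
    """Sketch of the elevation descriptor for one of the six views.

    depth_image: 2-D array of altitude (elevation) values rendered from
    one view of the 3-D model. n_rings is an illustrative choice.
    """
    h, w = depth_image.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    yy, xx = np.mgrid[0:h, 0:w]
    r = np.hypot(yy - cy, xx - cx)
    r_max = r.max()
    # Sum the altitude values inside each concentric ring.
    ring_sums = np.zeros(n_rings)
    for k in range(n_rings):
        mask = (r >= k * r_max / n_rings) & (r < (k + 1) * r_max / n_rings)
        ring_sums[k] = depth_image[mask].sum()
    # Descriptor: differences between the altitude sums of successive rings.
    return np.diff(ring_sums)
```

Per the abstract, six such views are computed per model and compared with a similarity matching step; only the per-view ring differencing is shown here.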
IEEE Transactions on Multimedia | 2009
Chang-Hsing Lee; Jau-Ling Shih; Kun-Ming Yu; Hwai-San Lin
In this paper, we propose an automatic music genre classification approach based on long-term modulation spectral analysis of spectral (OSC and MPEG-7 NASE) and cepstral (MFCC) features. Modulation spectral analysis of each feature value generates a corresponding modulation spectrum, and all the modulation spectra are collected to form a modulation spectrogram, which exhibits the time-varying, or rhythmic, information of music signals. Each modulation spectrum is then decomposed into several logarithmically spaced modulation subbands. The modulation spectral contrast (MSC) and modulation spectral valley (MSV) are then computed from each modulation subband. Effective and compact features are generated from statistical aggregations of the MSCs and MSVs of all modulation subbands. An information fusion approach that integrates feature-level fusion and decision-level combination is employed to improve the classification accuracy. Experiments conducted on two different music datasets have shown that the proposed approach achieves higher classification accuracy than other approaches under the same experimental setup.
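The MSC/MSV computation for a single feature trajectory can be sketched as follows. The subband count, the exact log-spaced band edges, and the peak-minus-valley definition of contrast are assumptions of this sketch rather than details taken from the paper.

```python
import numpy as np

def modulation_spectral_contrast(feature_traj, n_subbands=4):
    """Sketch of MSC/MSV from one feature trajectory (e.g. one MFCC
    coefficient tracked over all analysis frames)."""
    # Modulation spectrum: magnitude FFT of the time trajectory.
    spec = np.abs(np.fft.rfft(feature_traj - feature_traj.mean()))
    n = len(spec)
    # Logarithmically spaced subband edges (skipping the DC bin).
    edges = np.unique(
        np.logspace(0, np.log10(n - 1), n_subbands + 1).astype(int))
    msc, msv = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        band = spec[lo:hi + 1]
        msv.append(band.min())          # valley of the subband
        msc.append(band.max() - band.min())  # contrast: peak minus valley
    return np.array(msc), np.array(msv)
```

Statistics (e.g. mean and standard deviation) of the MSC/MSV values across trajectories would then form the compact feature vector described in the abstract.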
international conference on multimedia and expo | 2007
Chang-Hsing Lee; Jau-Ling Shih; Kun-Ming Yu; Jung-Mau Su
In this paper, we propose a novel feature, called octave-based modulation spectral contrast (OMSC), for music genre classification. OMSC is extracted by long-term modulation spectrum analysis to represent the time-varying behavior of music signals. Experimental results have shown that OMSC outperforms MFCC and OSC. When OMSC is integrated with MFCC and OSC, the classification accuracy reaches 84.03% for the classification of seven music genres.
IEEE Transactions on Multimedia | 2013
Chang-Hsing Lee; Sheng-Bin Hsu; Jau-Ling Shih; Chih-Hsun Chou
Traditional birdsong recognition approaches used acoustic features based on the acoustic model of speech production or the perceptual model of the human auditory system to identify the associated bird species. In this paper, a new feature descriptor that uses image shape features is proposed to identify bird species based on the recognition of fixed-duration birdsong segments whose corresponding spectrograms are viewed as gray-level images. The MPEG-7 angular radial transform (ART) descriptor, which can compactly and efficiently describe the gray-level variations within an image region in both angular and radial directions, is employed to extract the shape features from the spectrogram image. To effectively capture both frequency and temporal variations within a birdsong segment using ART, a sector expansion algorithm is proposed to transform its spectrogram image into a corresponding sector image such that the frequency and temporal axes of the spectrogram image align with the radial and angular directions of the ART basis functions, respectively. For the classification of 28 bird species using Gaussian mixture models (GMM), the best classification accuracies are 86.30% and 94.62% for 3-second and 5-second birdsong segments, respectively, using the proposed ART descriptor, which outperforms traditional descriptors such as LPCC, MFCC, and TDMFCC.
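The sector-expansion idea (frequency mapped to the radial direction, time to the angular direction) can be sketched as a simple polar resampling. The output size and the nearest-neighbour sampling below are assumptions of this illustration; the paper's actual algorithm may differ.

```python
import numpy as np

def sector_expansion(spectrogram, out_size=64):
    """Sketch: resample a (frequency x time) spectrogram into a disc
    image so that frequency runs along the radius and time runs along
    the angle, aligning both axes with the ART basis functions."""
    n_freq, n_time = spectrogram.shape
    c = (out_size - 1) / 2.0
    sector = np.zeros((out_size, out_size))
    for y in range(out_size):
        for x in range(out_size):
            r = np.hypot(y - c, x - c) / c  # normalized radius
            theta = (np.arctan2(y - c, x - c) + np.pi) / (2 * np.pi)
            if r <= 1.0:
                fi = min(int(r * (n_freq - 1) + 0.5), n_freq - 1)      # freq -> radius
                ti = min(int(theta * (n_time - 1) + 0.5), n_time - 1)  # time -> angle
                sector[y, x] = spectrogram[fi, ti]
    return sector
```

The ART descriptor would then be computed on the resulting disc image rather than on the raw rectangular spectrogram.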
signal-image technology and internet-based systems | 2013
Chang-Hsing Lee; Jau-Ling Shih; Cheng-Chang Lien; Chin-Chuan Han
Digital cameras and camera phones are widely used to capture images. However, the visual quality (contrast, color rendition, etc.) of some acquired images may be poor due to the limitations of capturing devices or improper illumination conditions, particularly in wide-dynamic-range scenes, where images generally contain both overexposed and underexposed areas. Conventional image enhancement methods may either fail to produce satisfactory, undistorted images or cannot improve every region of interest appropriately. Single-scale retinex (SSR) and multiscale retinex (MSR), the latter defined as a weighted sum of several SSRs, were developed for local image contrast enhancement and dynamic range compression. In this paper, an adaptive multiscale retinex (AMSR) approach is proposed for image contrast enhancement. In AMSR, the weight associated with each SSR output image is adaptively computed according to the content of the input image in order to produce an enhanced image with a natural impression and proper tonal rendition in every region of the image. Experimental results on several low-contrast images have shown that the proposed AMSR approach produces natural and appealing enhanced images.
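The SSR/MSR baseline that AMSR builds on can be sketched as follows. The Gaussian scales and the equal default weights are conventional illustrative choices; the paper's contribution, the content-adaptive weight computation, is deliberately left abstract here.

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian smoothing (surround estimate) using NumPy only.
    The kernel radius is clamped so it never exceeds the image size."""
    radius = max(1, min(int(3 * sigma), min(img.shape) // 2 - 1))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x**2 / (2.0 * sigma**2))
    kernel /= kernel.sum()
    blurred = np.apply_along_axis(
        lambda row: np.convolve(row, kernel, mode='same'), 1, img)
    return np.apply_along_axis(
        lambda col: np.convolve(col, kernel, mode='same'), 0, blurred)

def single_scale_retinex(img, sigma):
    """SSR: log of the image minus log of its Gaussian-smoothed surround."""
    img = img.astype(float) + 1.0  # avoid log(0)
    return np.log(img) - np.log(gaussian_blur(img, sigma))

def multiscale_retinex(img, sigmas=(15, 80, 250), weights=None):
    """MSR: weighted sum of SSR outputs at several scales. Equal weights
    give classic MSR; AMSR instead derives the weights adaptively from
    the input image content (that rule is not reproduced here)."""
    ssr_outputs = [single_scale_retinex(img, s) for s in sigmas]
    if weights is None:
        weights = np.full(len(sigmas), 1.0 / len(sigmas))
    return sum(w * r for w, r in zip(ssr_outputs, weights))
```

Replacing the fixed `weights` with image-dependent values is exactly where the adaptive part of AMSR would plug in.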
intelligent information hiding and multimedia signal processing | 2009
Chang-Hsing Lee; Hwai-San Lin; Chih-Hsun Chou; Jau-Ling Shih
In this paper, we propose an automatic music genre classification approach based on long-term modulation spectral analysis of the static and transitional information of spectral (OSC and MPEG-7 NASE) and cepstral (MFCC) features. An information fusion approach that integrates feature-level fusion and decision-level combination is employed to improve the classification accuracy. Experiments conducted on the music database employed in the ISMIR2004 Audio Description Contest have shown that the proposed approach achieves a classification accuracy of 87.79%, better than the winner of the contest.
international conference on acoustics, speech, and signal processing | 2012
Chang-Hsing Lee; Jau-Ling Shih; Chih-Hsun Chou; Kung-Ming Yu; Chuan-Yen Hung
In this paper, we propose a 3D model retrieval approach using 2D cepstral features. First, six projection planes representing the elevation (depth) values are generated. Then, 2D cepstral features are extracted from each projection plane for searching similar 3D models. Experiments conducted on the Princeton Shape Benchmark (PSB) database have shown that the proposed 2D cepstral features outperform other state-of-the-art descriptors in terms of the DCG score.
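A 2D cepstrum of a depth projection plane can be sketched as below: the real cepstrum obtained via the 2D DFT of the log magnitude spectrum, keeping a small block of low-order coefficients as the feature. The number of retained coefficients, and whether the paper uses exactly this real-cepstrum formulation, are assumptions of the sketch.

```python
import numpy as np

def cepstrum_2d(depth_plane, keep=8):
    """Sketch of 2-D cepstral features from one elevation (depth)
    projection plane of a 3-D model."""
    # Log magnitude spectrum; small epsilon avoids log(0).
    spectrum = np.abs(np.fft.fft2(depth_plane)) + 1e-10
    cepstrum = np.real(np.fft.ifft2(np.log(spectrum)))
    # Keep a compact block of low-order cepstral coefficients.
    return cepstrum[:keep, :keep].ravel()
```

Per the abstract, six such planes are generated per model, so the retrieval feature would concatenate the six per-plane vectors.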
broadband and wireless computing, communication and applications | 2010
Jau-Ling Shih; Chang-Hsing Lee; Chih-Hsun Chou; Hsiang-Yuen Chang
In recent years, the demand for content-based 3D model retrieval systems has become an important issue. In this paper, the cylindrical projection descriptor (CPD) is proposed for 3D model retrieval. To derive better retrieval results, the CPD is combined with the radial distance descriptor (RDD). Experiments are conducted on the Princeton Shape Benchmark (PSB) database, and the results show that the proposed method is superior to others.
international conference on model transformation | 2011
Chang-Hsing Lee; Jau-Ling Shih; Kun-Ming Yu; Hsiang-Yuen Chang; Yih-Chih Chiou
In this paper, the combination of different projected shape features is proposed for 3D model retrieval. The projection features include the elevation value (depth), the radial distance, and the angle of a surface mesh. For each of these characteristic values, six projection planes represented as gray-level images are generated. The MPEG-7 angular radial transform (ART) is then used to compute a feature vector from each projection plane. Experiments conducted on the Princeton Shape Benchmark (PSB) database have shown that the proposed approach outperforms state-of-the-art descriptors in terms of the DCG score.
asia-pacific services computing conference | 2008
Chang-Hsing Lee; Jau-Ling Shih; Kun-Ming Yu; Hwai-San Lin; Ming-Hui Wei
In this paper, an automatic music genre classification approach is proposed that integrates features derived from the static and transitional information of cepstral (MFCC) and spectral (OSC) features. MFCC and OSC capture the characteristics of a single audio frame. The transitional features, including delta-MFCC, delta-OSC, delta-delta-MFCC, and delta-delta-OSC, are therefore extracted and combined with MFCC and OSC to improve the classification accuracy. Two information fusion techniques, feature-level fusion and decision-level fusion, are developed to combine the extracted feature vectors. Experiments conducted on the music database employed in the ISMIR2004 Audio Description Contest have shown that the proposed approach achieves a classification accuracy of 84.23%, better than the winner of the ISMIR2004 music genre classification contest.
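The delta (transitional) features and the feature-level fusion step can be sketched as follows. The regression window width and edge padding are standard illustrative choices, not details from the paper; the frame-level features themselves (MFCC, OSC) are assumed to be computed elsewhere.

```python
import numpy as np

def delta(features, width=2):
    """Sketch of delta features: a regression over neighbouring frames.
    `features` is (n_frames, n_dims); width=2 is a conventional choice."""
    padded = np.pad(features, ((width, width), (0, 0)), mode='edge')
    n = len(features)
    num = sum(k * (padded[width + k: n + width + k]
                   - padded[width - k: n + width - k])
              for k in range(1, width + 1))
    denom = 2 * sum(k * k for k in range(1, width + 1))
    return num / denom

def feature_level_fusion(mfcc, osc):
    """Feature-level fusion: concatenate static, delta, and delta-delta
    versions of both feature streams per frame."""
    return np.hstack([mfcc, delta(mfcc), delta(delta(mfcc)),
                      osc, delta(osc), delta(delta(osc))])
```

Decision-level fusion, the second technique named in the abstract, would instead train separate classifiers per stream and combine their outputs; only the feature-level path is shown here.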