Chia-Hsin Hsieh | Researchain

Publication

Featured research published by Chia-Hsin Hsieh.

IEEE Transactions on Audio, Speech, and Language Processing | 2006

Multiple change-point audio segmentation and classification using an MDL-based Gaussian model

Chung-Hsien Wu; Chia-Hsin Hsieh

This study presents an approach for segmenting and classifying an audio stream based on audio type. First, a silence deletion procedure is employed to remove silence segments in the audio stream. A minimum description length (MDL)-based Gaussian model is then proposed to statistically characterize the audio features. Using this model, the audio stream is segmented into a sequence of homogeneous subsegments. A hierarchical threshold-based classifier is then used to classify each subsegment into different audio types. Finally, a heuristic method is adopted to smooth the subsegment sequence and provide the final segmentation and classification results. Experimental results indicate that for TDT-3 news broadcast, a missed detection rate (MDR) of 0.1 and a false alarm rate (FAR) of 0.14 were achieved for audio segmentation. Given the same MDR and FAR values, segment-based audio classification achieved a classification accuracy of 88%, which is better than that of a clip-based approach.
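The idea of MDL-based change-point detection can be illustrated with a minimal sketch. It is not the authors' model: it assumes a one-dimensional feature stream, a single maximum-likelihood Gaussian per segment, and a standard two-part code length (negative log-likelihood plus half a log-sample-count per parameter). The function names `gaussian_code_length` and `best_change_point` and the `min_len` parameter are hypothetical.

```python
import math

def gaussian_code_length(xs):
    """Two-part description length (in nats) of 1-D samples under one
    maximum-likelihood Gaussian: negative log-likelihood plus
    (k / 2) * log(n) for the k = 2 parameters (mean, variance)."""
    n = len(xs)
    mean = sum(xs) / n
    # Variance floor (an assumption) avoids log(0) on constant segments.
    var = max(sum((x - mean) ** 2 for x in xs) / n, 1e-8)
    neg_log_lik = 0.5 * n * (math.log(2 * math.pi * var) + 1.0)
    penalty = 0.5 * 2 * math.log(n)
    return neg_log_lik + penalty

def best_change_point(xs, min_len=5):
    """Return (index, gain) for the split that shortens the description
    the most, or (None, 0.0) if no split is shorter than one Gaussian."""
    whole = gaussian_code_length(xs)
    best_idx, best_gain = None, 0.0
    for t in range(min_len, len(xs) - min_len + 1):
        split = gaussian_code_length(xs[:t]) + gaussian_code_length(xs[t:])
        gain = whole - split
        if gain > best_gain:
            best_idx, best_gain = t, gain
    return best_idx, best_gain

# Toy stream (made-up values): 20 low-level samples, then 20 high-level ones.
stream = [0.0, 0.1, -0.1, 0.05, -0.05] * 4 + [5.0, 5.1, 4.9, 5.05, 4.95] * 4
index, gain = best_change_point(stream)
print(index, gain > 0)
```

Detecting multiple change points, as in the paper's title, would require applying such a test repeatedly (for example, recursively on each resulting segment); that step is left out of this sketch.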

IEEE Transactions on Audio, Speech, and Language Processing | 2009

Story Segmentation and Topic Classification of Broadcast News via a Topic-Based Segmental Model and a Genetic Algorithm

Chung-Hsien Wu; Chia-Hsin Hsieh

This paper presents a two-stage approach to story segmentation and topic classification of broadcast news. The two-stage paradigm adopts a decision tree and a maximum entropy model to identify the potential story boundaries in the broadcast news within a sliding window. The problem for story segmentation is thus transformed to the determination of a boundary position sequence from the potential boundary regions. A genetic algorithm is then applied to determine the chromosome, which corresponds to the final boundary position sequence. A topic-based segmental model is proposed to define the fitness function applied in the genetic algorithm. The syllable- and word-based story segmentation schemes are adopted to evaluate the proposed approach. Experimental results indicate that a miss probability of 0.1587 and a false alarm probability of 0.0859 are achieved for story segmentation on the collected broadcast news corpus. On the TDT-3 Mandarin audio corpus, a miss probability of 0.1232 and a false alarm probability of 0.1298 are achieved. Moreover, an outside classification accuracy of 74.55% is obtained for topic classification on the collected broadcast news, while an inside classification accuracy of 88.82% is achieved on the TDT-2 Mandarin audio corpus.

International Symposium on Chinese Spoken Language Processing | 2004

Spoken document summarization using topic-related corpus and semantic dependency grammar

Chia-Hsin Hsieh; Chien-Lin Huang; Chung-Hsien Wu

The paper presents a spoken document summarization scheme using a topic-related corpus and semantic dependency grammar. The summarization score considers speech recognition confidence, word significance, word trigram, semantic dependency grammar (SDG) and probabilistic context free grammar (PCFG). In addition, a topic-related corpus consisting of keywords as well as articles is used to estimate the word significance score using latent semantic indexing (LSI). Semantic relations between words are determined by SDG using HowNet and Sinica Treebank. A dynamic programming algorithm is applied to decide the summarization ratio and look for the best summarization result according to summarization scores. Experimental results indicate that the proposed approach effectively extracts important words with semantic dependency and gives a promising speech summary.
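A dynamic programming search of this general kind can be sketched as follows. This is an illustration under simplifying assumptions, not the scheme from the paper: each word carries one precomputed score (standing in for confidence and significance), a hypothetical `link_score` function stands in for the trigram and grammar terms, and the number of kept words `k` is fixed in advance rather than decided by the algorithm.

```python
def summarize(words, word_score, link_score, k):
    """Pick k words, in original order, that maximize the sum of per-word
    scores plus link scores between consecutive kept words."""
    n = len(words)
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and len(words)")
    neg_inf = float("-inf")
    # best[j][i]: best score keeping j + 1 words with the last one at i.
    best = [[neg_inf] * n for _ in range(k)]
    back = [[None] * n for _ in range(k)]
    for i in range(n):
        best[0][i] = word_score[i]
    for j in range(1, k):
        for i in range(j, n):
            for p in range(j - 1, i):
                cand = best[j - 1][p] + link_score(words[p], words[i]) + word_score[i]
                if cand > best[j][i]:
                    best[j][i] = cand
                    back[j][i] = p
    # Trace back from the best final position.
    end = max(range(k - 1, n), key=lambda i: best[k - 1][i])
    picked, j, i = [], k - 1, end
    while i is not None:
        picked.append(i)
        i, j = back[j][i], j - 1
    picked.reverse()
    return [words[i] for i in picked], best[k - 1][end]

# Made-up sentence and scores, for illustration only.
words = ["the", "storm", "really", "closed", "schools", "today"]
scores = [0.1, 0.9, 0.1, 0.7, 0.8, 0.3]
summary, total = summarize(words, scores, lambda a, b: 0.0, k=3)
print(summary)
```

With a constant link score the search reduces to keeping the highest-scoring words; a non-trivial `link_score` is what lets the search prefer word pairs that belong together.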

IEEE Transactions on Multimedia | 2007

Speech Sentence Compression Based on Speech Segment Extraction and Concatenation

Chung-Hsien Wu; Chia-Hsin Hsieh; Chien-Lin Huang

This correspondence presents a speech sentence compression scheme. A compressed word sequence is first extracted. Speech segments in the spoken document that correspond to the extracted words are then selected for concatenation. Evaluation of the proposed approach shows that the compressed speech sentence retains important and meaningful information as well as naturalness.

Speech Communication | 2008

Stochastic vector mapping-based feature enhancement using prior-models and model adaptation for noisy speech recognition

Chia-Hsin Hsieh; Chung-Hsien Wu

This paper presents an approach to feature enhancement for noisy speech recognition. Three prior-models are introduced to characterize clean speech, noise and noisy speech, respectively. Sequential noise estimation is employed for prior-model construction based on noise-normalized stochastic vector mapping. Therefore, feature enhancement can work without stereo training data and manual tagging of background noise type based on the auto-clustering on the estimated noise data. Environment model adaptation is also adopted to reduce the mismatch between training data and test data. For the evaluation on the AURORA2 database, the experimental results indicate that a 9.6% relative reduction in digit error rate for multi-condition training and a 3.5% relative reduction in digit error rate for clean speech training were achieved without stereo training data compared to the SPLICE-based approach. For MATBN Mandarin broadcast news database with multi-condition training, a 13% relative reduction in syllable error rate for anchor speech, a 12% relative reduction in syllable error rate for field reporter speech and a 7% relative reduction in syllable error rate for interviewee speech were obtained compared to the MCE-based approach.
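The results above are stated as relative reductions in error rate. A small sketch of that arithmetic follows, assuming the usual definition (baseline minus new, divided by baseline); the function name and the numbers in the example are hypothetical and are not taken from the paper.

```python
def relative_reduction(baseline_error, new_error):
    """Relative error-rate reduction: (baseline - new) / baseline."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - new_error) / baseline_error

# Hypothetical error rates (in percent), chosen only to show the arithmetic.
print(round(relative_reduction(10.0, 9.04) * 100, 1))
```

A relative reduction depends on the baseline: the same percentage corresponds to a smaller absolute gain when the baseline error rate is already low.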

International Conference on Multimedia and Expo | 2008

Unsupervised pronunciation grammar growing using knowledge-based and data-driven approaches

Chien-Lin Huang; Chung-Hsien Wu; Haizhou Li; Chia-Hsin Hsieh; Bin Ma

This study presents a novel approach to unsupervised pronunciation grammar growing for non-native speech recognition. Unsupervised pronunciation grammar growing includes pronunciation variation graph construction and non-native grammar generation. Knowledge-based and data-driven approaches are considered for variation graph construction. The measurement of confidence and support is used for grammar selection. Experiments show that unsupervised pronunciation grammar growing is suitable for the improvement of non-native speech recognition.

International Conference on Wireless Networks | 2005

Spoken document summarization and retrieval for wireless application

Chung-Hsien Wu; Chien-Lin Huang; Chia-Hsin Hsieh

For the purpose of wireless data transformation, spoken document summarization can efficiently reduce redundant content. This study presents a voice-activated spoken document summarization and retrieval scheme using text and speech analysis. In this method, prosody, speech recognition confidence, word significance, word trigram and semantic dependency are considered in the summarization score. A dynamic programming algorithm is used to seek the best summarization result. Experimental results indicate that the proposed approach effectively produces concise spoken sentences containing important words with semantic dependency.

Archive | 2009