Publication


Featured research published by Shih Hsiang Lin.


ACM Transactions on Asian Language Information Processing | 2009

A Comparative Study of Probabilistic Ranking Models for Chinese Spoken Document Summarization

Shih Hsiang Lin; Berlin Chen; Hsin-Min Wang

Extractive document summarization automatically selects a number of indicative sentences, passages, or paragraphs from an original document according to a target summarization ratio, and sequences them to form a concise summary. In this article, we present a comparative study of various probabilistic ranking models for spoken document summarization, including supervised classification-based summarizers and unsupervised probabilistic generative summarizers. We also investigate the use of unsupervised summarizers to improve the performance of supervised summarizers when manual labels are not available for training the latter. A novel training data selection approach that leverages the relevance information of spoken sentences to select reliable document-summary pairs derived by the probabilistic generative summarizers is explored for training the classification-based summarizers. Encouraging initial results on Mandarin Chinese broadcast news data are demonstrated.


IEEE Transactions on Audio, Speech, and Language Processing | 2011

Leveraging Kullback–Leibler Divergence Measures and Information-Rich Cues for Speech Summarization

Shih Hsiang Lin; Yao Ming Yeh; Berlin Chen

Imperfect speech recognition often leads to degraded performance when exploiting conventional text-based methods for speech summarization. To alleviate this problem, this paper investigates various ways to robustly represent the recognition hypotheses of spoken documents beyond the top scoring ones. Moreover, a summarization framework, building on the Kullback-Leibler (KL) divergence measure and exploring both the relevance and topical information cues of spoken documents and sentences, is presented to work with such robust representations. Experiments on broadcast news speech summarization tasks appear to demonstrate the utility of the presented approaches.
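The KL-divergence selection criterion described above can be illustrated with a minimal sketch: rank each sentence by KL(document model || sentence model), so that sentences whose word distribution best matches the whole document rank first. This sketch assumes simple smoothed unigram language models over 1-best transcripts; the paper's representations (multiple recognition hypotheses, relevance and topical cues) are richer, and `rank_sentences` is a hypothetical helper, not code from the paper.

```python
import math
from collections import Counter

def unigram_model(tokens, vocab, eps=1e-9):
    """Smoothed unigram language model over a fixed vocabulary."""
    counts = Counter(tokens)
    total = len(tokens)
    return {w: (counts[w] + eps) / (total + eps * len(vocab)) for w in vocab}

def kl_divergence(p, q, vocab):
    """KL(p || q) summed over the shared vocabulary."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in vocab)

def rank_sentences(sentences):
    """Rank sentences (lists of tokens) by KL(document || sentence);
    lower divergence means the sentence better covers the document's
    overall word distribution."""
    doc_tokens = [w for s in sentences for w in s]
    vocab = set(doc_tokens)
    doc_model = unigram_model(doc_tokens, vocab)
    scored = [(kl_divergence(doc_model, unigram_model(s, vocab), vocab), s)
              for s in sentences]
    return sorted(scored, key=lambda pair: pair[0])
```

Selecting the top-ranked sentences up to the target summarization ratio then yields the extractive summary.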


IEEE Transactions on Audio, Speech, and Language Processing | 2012

A Risk-Aware Modeling Framework for Speech Summarization

Berlin Chen; Shih Hsiang Lin

Extractive speech summarization attempts to select a representative set of sentences from a spoken document so as to succinctly describe the main theme of the original document. In this paper, we adapt the notion of risk minimization for extractive speech summarization by formulating the selection of summary sentences as a decision-making problem. To this end, we develop several selection strategies and modeling paradigms that can leverage supervised and unsupervised summarization models to inherit their individual merits as well as to overcome their inherent limitations. On top of that, various component models are introduced, providing a principled way to render the redundancy and coherence relationships among sentences and between sentences and the whole document, respectively. A series of experiments on speech summarization seem to demonstrate that the methods deduced from our summarization framework are very competitive with existing summarization methods.


Information Processing and Management | 2013

Extractive speech summarization using evaluation metric-related training criteria

Berlin Chen; Shih Hsiang Lin; Yu Mei Chang; Jia Wen Liu

The purpose of extractive speech summarization is to automatically select a number of indicative sentences or paragraphs (or audio segments) from the original spoken document according to a target summarization ratio and then concatenate them to form a concise summary. Much work on extractive summarization has focused on machine-learning approaches that cast important sentence selection as a two-class classification problem; such approaches have been applied with some success to a number of speech summarization tasks. However, the imbalanced-data problem sometimes results in a trained speech summarizer with unsatisfactory performance. Furthermore, training the summarizer by improving the associated classification accuracy does not always lead to better summarization evaluation performance. In view of such phenomena, we present in this paper an empirical investigation of the merits of two schools of training criteria to alleviate the negative effects caused by the aforementioned problems, as well as to boost the summarization performance. One is to learn the classification capability of a summarizer on the basis of the pair-wise ordering information of sentences in a training document according to a degree of importance. The other is to train the summarizer by directly maximizing the associated evaluation score or optimizing an objective that is linked to the ultimate evaluation. Experimental results on the broadcast news summarization task suggest that these training criteria can give substantial improvements over a few existing summarization methods.
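The first training criterion above, learning from the pair-wise ordering of sentences, can be sketched as a ranking-SVM-style pairwise hinge loss: whenever a more-important sentence does not outscore a less-important one by a margin, nudge the weight vector toward their feature difference. The linear scorer, function name, and perceptron-style update below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def train_pairwise_ranker(features, importance, epochs=200, lr=0.1, margin=1.0):
    """Learn a linear scorer w so that sentences with higher importance labels
    score above less-important ones, via a pairwise hinge criterion."""
    X = np.asarray(features, dtype=float)
    w = np.zeros(X.shape[1])
    # Every ordered pair (i, j) where sentence i should outrank sentence j.
    pairs = [(i, j) for i in range(len(X)) for j in range(len(X))
             if importance[i] > importance[j]]
    for _ in range(epochs):
        for i, j in pairs:
            # Update whenever the desired ordering is violated by the margin.
            if w @ X[i] - w @ X[j] < margin:
                w += lr * (X[i] - X[j])
    return w
```

At summarization time, sentences are ranked by their scores under `w` and the top-scoring ones are kept, so the objective directly targets ordering quality rather than per-sentence classification accuracy.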


IEEE Transactions on Audio, Speech, and Language Processing | 2009

Exploring the Use of Speech Features and Their Corresponding Distribution Characteristics for Robust Speech Recognition

Shih Hsiang Lin; Berlin Chen; Yao Ming Yeh

The performance of current automatic speech recognition (ASR) systems often deteriorates radically when the input speech is corrupted by various kinds of noise sources. Several methods have been proposed to improve ASR robustness over the last few decades. The related literature can be generally classified into two categories according to whether the methods are directly based on the feature domain or consider some specific statistical feature characteristics. In this paper, we present a polynomial regression approach that has the merit of directly characterizing the relationship between speech features and their corresponding distribution characteristics to compensate for noise interference. The proposed approach and a variant were thoroughly investigated and compared with a few existing noise robustness approaches. All experiments were conducted using the Aurora-2 database and task. The results show that our approaches achieve considerable word error rate reductions over the baseline system and are comparable to most of the conventional robustness approaches discussed in this paper.


IEEE Automatic Speech Recognition and Understanding Workshop | 2007

Training data selection for improving discriminative training of acoustic models

Shih Hung Liu; Fang Hui Chu; Shih Hsiang Lin; Hung Shin Lee; Berlin Chen

This paper considers training data selection for discriminative training of acoustic models for broadcast news speech recognition. Three novel data selection approaches were proposed. First, the average phone accuracy over all hypothesized word sequences in the word lattice of a training utterance was utilized for utterance-level data selection. Second, phone-level data selection based on the difference between the expected accuracy of a phone arc and the average phone accuracy of the word lattice was investigated. Finally, frame-level data selection based on the normalized frame-level entropy of Gaussian posterior probabilities obtained from the word lattice was explored. The underlying characteristics of the presented approaches were extensively investigated and their performance was verified by comparison with the standard discriminative training approaches. Experiments conducted on Mandarin broadcast news collected in Taiwan showed that both phone- and frame-level data selection could achieve slight but consistent improvements over the baseline systems at lower training iterations.
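The frame-level criterion above can be sketched directly: compute each frame's entropy over its Gaussian posterior probabilities, normalize by the maximum entropy log(K), and keep high-entropy frames, since uncertain frames lie near the decision boundary and carry the most discriminative information. The helper names and the fixed threshold are illustrative assumptions; the paper derives posteriors from word lattices rather than a plain posterior matrix.

```python
import numpy as np

def normalized_frame_entropy(posteriors):
    """Entropy of one frame's Gaussian posterior probabilities, normalized
    to [0, 1] by the maximum entropy log(K); 1.0 = maximally uncertain."""
    p = np.asarray(posteriors, dtype=float)
    p = p / p.sum()
    nz = p[p > 0]  # skip zero entries so log() is well defined
    return float(-(nz * np.log(nz)).sum() / np.log(len(p)))

def select_frames(frame_posteriors, threshold=0.5):
    """Return indices of frames whose normalized entropy exceeds the
    threshold, i.e. uncertain frames near the decision boundary."""
    return [t for t, post in enumerate(frame_posteriors)
            if normalized_frame_entropy(post) >= threshold]
```

Only the selected frames would then contribute statistics to the discriminative update, focusing training on the ambiguous regions of the acoustic space.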


International Conference on Acoustics, Speech, and Signal Processing | 2010

Leveraging evaluation metric-related training criteria for speech summarization

Shih Hsiang Lin; Yu Mei Chang; Jia Wen Liu; Berlin Chen

Many of the existing machine-learning approaches to speech summarization cast important sentence selection as a two-class classification problem and have shown empirical success for a wide variety of summarization tasks. However, the imbalanced-data problem sometimes results in a trained speech summarizer with unsatisfactory performance. On the other hand, training the summarizer by improving the associated classification accuracy does not always lead to better summarization evaluation performance. In view of such phenomena, we investigate two different training criteria to alleviate the negative effects caused by these problems, as well as to boost the summarizer's performance. One is to learn the classification capability of a summarizer on the basis of the pair-wise ordering information of sentences in a training document according to a degree of importance. The other is to train the summarizer by directly maximizing the associated evaluation score. Experimental results on the broadcast news summarization task show that these two training criteria can give substantial improvements over the baseline SVM summarization system.


International Conference on Multimedia and Expo | 2007

Investigating Data Selection for Minimum Phone Error Training of Acoustic Models

Shih Hung Liu; Fang Hui Chu; Shih Hsiang Lin; Berlin Chen

This paper considers minimum phone error (MPE) based discriminative training of acoustic models for Mandarin broadcast news recognition. A novel data selection approach based on the normalized frame-level entropy of Gaussian posterior probabilities obtained from the word lattice of the training utterance was explored. It has the merit of making the training algorithm focus much more on the training statistics of those frame samples that center nearly around the decision boundary for better discrimination. Moreover, we presented a new phone accuracy function based on the frame-level accuracy of hypothesized phone arcs instead of using the raw phone accuracy function of MPE training. The underlying characteristics of the presented approaches were extensively investigated and their performance was verified by comparison with the original MPE training approach. Experiments conducted on the broadcast news collected in Taiwan showed that the integration of the frame-level data selection and accuracy calculation could achieve slight but consistent improvements over the baseline system.


IEEE Automatic Speech Recognition and Understanding Workshop | 2007

Investigating the use of speech features and their corresponding distribution characteristics for robust speech recognition

Shih Hsiang Lin; Yao Ming Yeh; Berlin Chen

The performance of current automatic speech recognition (ASR) systems often deteriorates radically when the input speech is corrupted by various kinds of noise sources. Quite a few techniques have been proposed to improve ASR robustness over the last few decades. Related work reported in the literature can be generally divided into two categories according to whether the methods operate in the feature domain or on the corresponding probability distributions. In this paper, we present a polynomial regression approach that has the merit of directly characterizing the relationship between the speech features and their corresponding probability distributions to compensate for noise effects. Two variants of the proposed approach are also extensively investigated. All experiments are conducted on the Aurora-2 database and task. Experimental results show that for clean-condition training, our approaches achieve considerable word error rate reductions over the baseline system and also significantly outperform other conventional methods.


International Conference on Acoustics, Speech, and Signal Processing | 2008

A comparative study of probabilistic ranking models for spoken document summarization

Shih Hsiang Lin; Yi Ting Chen; Hsin-Min Wang; Berlin Chen

The purpose of extractive document summarization is to automatically select a number of indicative sentences, passages, or paragraphs from the original document according to a target summarization ratio and then sequence them to form a concise summary. In this paper, we present a comparative study of various supervised and unsupervised probabilistic ranking models for spoken document summarization on Chinese broadcast news. Moreover, we also investigate the possibility of using unsupervised summarizers to boost the performance of supervised summarizers when manual labels are not available for the training of supervised summarizers. Encouraging initial results were demonstrated.

Collaboration

Top co-authors of Shih Hsiang Lin:

Berlin Chen (National Taiwan Normal University)
Yao Ming Yeh (National Taiwan Normal University)
Fang Hui Chu (National Taiwan Normal University)
Jia Wen Liu (National Taiwan Normal University)
Wei Hau Chen (National Taiwan Normal University)
Yi Ting Chen (National Taiwan Normal University)
Yu Mei Chang (National Taiwan Normal University)