Publication


Featured research published by Shih Hung Liu.


IEEE Transactions on Audio, Speech, and Language Processing | 2015

Combining relevance language modeling and clarity measure for extractive speech summarization

Shih Hung Liu; Kuan Yu Chen; Berlin Chen; Hsin-Min Wang; Hsu Chun Yen; Wen-Lian Hsu

Extractive speech summarization, which purports to select an indicative set of sentences from a spoken document so as to succinctly represent the most important aspects of the document, has garnered much research over the years. In this paper, we cast extractive speech summarization as an ad-hoc information retrieval (IR) problem and investigate various language modeling (LM) methods for important sentence selection. The main contributions of this paper are four-fold. First, we explore a novel sentence modeling paradigm built on top of the notion of relevance, where the relationship between a candidate summary sentence and a spoken document to be summarized is discovered through different granularities of context for relevance modeling. Second, not only lexical but also topical cues inherent in the spoken document are exploited for sentence modeling. Third, we propose a novel clarity measure for use in important sentence selection, which can help quantify the thematic specificity of each individual sentence that is deemed to be a crucial indicator orthogonal to the relevance measure provided by the LM-based methods. Fourth, in an attempt to lessen summarization performance degradation caused by imperfect speech recognition, we investigate making use of different levels of index features for LM-based sentence modeling, including words, subword-level units, and their combination. Experiments on broadcast news summarization seem to demonstrate the performance merits of our methods when compared to several existing well-developed and/or state-of-the-art methods.
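The clarity measure described above can be illustrated with a small sketch: score each candidate sentence by the KL divergence between its unigram language model and a background collection model, so that thematically specific sentences score higher. The smoothing scheme and toy data here are illustrative assumptions, not the paper's exact formulation.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) over a shared vocabulary; assumes q is nonzero wherever p is."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p if p[w] > 0)

def unigram_lm(tokens, vocab, smoothing=1e-6):
    counts = {w: smoothing for w in vocab}
    for t in tokens:
        counts[t] += 1
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

# Toy collection: the clarity of each candidate sentence is its KL divergence
# from the background (whole-collection) model; higher means more specific.
background = "the cat sat on the mat the dog sat on the log".split()
sentences = [
    "the cat sat".split(),       # generic wording, close to the background
    "dog log dog log".split(),   # thematically specific wording
]
vocab = set(background) | {w for s in sentences for w in s}
bg_model = unigram_lm(background, vocab)
clarity = [kl_divergence(unigram_lm(s, vocab), bg_model) for s in sentences]
```

In the paper this clarity score complements the relevance score from the LM-based methods; here only the clarity side is sketched.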


IEEE Automatic Speech Recognition and Understanding Workshop | 2007

Training data selection for improving discriminative training of acoustic models

Shih Hung Liu; Fang Hui Chu; Shih Hsiang Lin; Hung Shin Lee; Berlin Chen

This paper considers training data selection for discriminative training of acoustic models for broadcast news speech recognition. Three novel data selection approaches were proposed. First, the average phone accuracy over all hypothesized word sequences in the word lattice of a training utterance was utilized for utterance-level data selection. Second, phone-level data selection based on the difference between the expected accuracy of a phone arc and the average phone accuracy of the word lattice was investigated. Finally, frame-level data selection based on the normalized frame-level entropy of Gaussian posterior probabilities obtained from the word lattice was explored. The underlying characteristics of the presented approaches were extensively investigated and their performance was verified by comparison with the standard discriminative training approaches. Experiments conducted on the Mandarin broadcast news collected in Taiwan showed that both phone- and frame-level data selection could achieve slight but consistent improvements over the baseline systems at lower training iterations.
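The frame-level criterion above can be sketched in a few lines: normalize the entropy of each frame's Gaussian posterior distribution by the log of the number of components, and keep frames whose normalized entropy is high, i.e. frames near a decision boundary. The threshold and toy posteriors are illustrative assumptions.

```python
import math

def normalized_entropy(posteriors):
    """Entropy of a frame's posterior distribution, normalized to [0, 1] by
    log(number of components); values near 1 indicate an ambiguous frame
    close to a decision boundary."""
    n = len(posteriors)
    if n < 2:
        return 0.0
    h = -sum(p * math.log(p) for p in posteriors if p > 0)
    return h / math.log(n)

# Frames whose normalized entropy exceeds a threshold are retained for
# discriminative training (the 0.5 threshold is purely illustrative).
frames = [
    [0.97, 0.01, 0.01, 0.01],  # confident frame: low entropy, skipped
    [0.30, 0.25, 0.25, 0.20],  # ambiguous frame: high entropy, selected
]
selected = [i for i, f in enumerate(frames) if normalized_entropy(f) > 0.5]
```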


IEEE Transactions on Audio, Speech, and Language Processing | 2015

Extractive broadcast news summarization leveraging recurrent neural network language modeling techniques

Kuan Yu Chen; Shih Hung Liu; Berlin Chen; Hsin-Min Wang; Ea Ee Jan; Wen-Lian Hsu; Hsin-Hsi Chen

Extractive text or speech summarization manages to select a set of salient sentences from an original document and concatenate them to form a summary, enabling users to better browse through and understand the content of the document. A recent stream of research on extractive summarization is to employ the language modeling (LM) approach for important sentence selection, which has proven to be effective for performing speech summarization in an unsupervised fashion. However, one of the major challenges facing the LM approach is how to formulate the sentence models and accurately estimate their parameters for each sentence in the document to be summarized. In view of this, our work in this paper explores a novel use of recurrent neural network language modeling (RNNLM) framework for extractive broadcast news summarization. On top of such a framework, the deduced sentence models are able to render not only word usage cues but also long-span structural information of word co-occurrence relationships within broadcast news documents, getting around the need for the strict bag-of-words assumption. Furthermore, different model complexities and combinations are extensively analyzed and compared. Experimental results demonstrate the performance merits of our summarization methods when compared to several well-studied state-of-the-art unsupervised methods.
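The key property the abstract highlights, that an RNNLM's recurrent state carries long-span left context rather than a bag of words, can be sketched with a minimal Elman-style forward pass. The weights below are untrained and random (for shape only), and the start-symbol convention is an assumption; the paper's actual sentence models are trained per sentence.

```python
import numpy as np

rng = np.random.default_rng(0)
V, H = 5, 8          # toy vocabulary size and hidden size
Wxh = rng.normal(scale=0.1, size=(H, V))   # input-to-hidden weights
Whh = rng.normal(scale=0.1, size=(H, H))   # recurrent weights
Who = rng.normal(scale=0.1, size=(V, H))   # hidden-to-output weights

def sentence_log_prob(token_ids):
    """log p(w_1..w_T) under an Elman RNNLM: the hidden state summarizes the
    entire left context, so no bag-of-words assumption is needed."""
    h = np.zeros(H)
    logp = 0.0
    prev = 0                      # assume id 0 doubles as a start symbol
    for t in token_ids:
        x = np.eye(V)[prev]                 # one-hot previous word
        h = np.tanh(Wxh @ x + Whh @ h)      # recurrence over the history
        logits = Who @ h
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()                # softmax over the vocabulary
        logp += np.log(probs[t])
        prev = t
    return float(logp)

# In the summarization framework, candidate sentences would be ranked by the
# likelihood their sentence-adapted model assigns to the document's words.
lp = sentence_log_prob([1, 2, 3])
```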


International Conference on Multimedia and Expo | 2007

Investigating Data Selection for Minimum Phone Error Training of Acoustic Models

Shih Hung Liu; Fang Hui Chu; Shih Hsiang Lin; Berlin Chen

This paper considers minimum phone error (MPE) based discriminative training of acoustic models for Mandarin broadcast news recognition. A novel data selection approach based on the normalized frame-level entropy of Gaussian posterior probabilities obtained from the word lattice of the training utterance was explored. It has the merit of making the training algorithm focus much more on the training statistics of those frame samples that center nearly around the decision boundary for better discrimination. Moreover, we presented a new phone accuracy function based on the frame-level accuracy of hypothesized phone arcs instead of using the raw phone accuracy function of MPE training. The underlying characteristics of the presented approaches were extensively investigated and their performance was verified by comparison with the original MPE training approach. Experiments conducted on the broadcast news collected in Taiwan showed that the integration of the frame-level data selection and accuracy calculation could achieve slight but consistent improvements over the baseline system.


Empirical Methods in Natural Language Processing | 2014

Leveraging Effective Query Modeling Techniques for Speech Recognition and Summarization

Kuan Yu Chen; Shih Hung Liu; Berlin Chen; Ea Ee Jan; Hsin-Min Wang; Wen-Lian Hsu; Hsin-Hsi Chen

Statistical language modeling (LM) that purports to quantify the acceptability of a given piece of text has long been an interesting yet challenging research area. In particular, language modeling for information retrieval (IR) has enjoyed remarkable empirical success; one emerging stream of the LM approach for IR is to employ the pseudo-relevance feedback process to enhance the representation of an input query so as to improve retrieval effectiveness. This paper presents a continuation of such a general line of research and the main contribution is threefold. First, we propose a principled framework which can unify the relationships among several widely-used query modeling formulations. Second, on top of the successfully developed framework, we propose an extended query modeling formulation by incorporating critical query-specific information cues to guide the model estimation. Third, we further adopt and formalize such a framework to the speech recognition and summarization tasks. A series of empirical experiments reveal the feasibility of such an LM framework and the performance merits of the deduced models on these two tasks.
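The pseudo-relevance feedback step described above can be sketched as a relevance-model (RM1-style) estimate: weight each top-retrieved document by its query likelihood and mix their unigram models into an expanded query model. The additive smoothing and toy corpus are illustrative assumptions, not the paper's exact estimator.

```python
import math
from collections import Counter

def doc_lm(tokens):
    counts = Counter(tokens)
    total = len(tokens)
    return {w: c / total for w, c in counts.items()}

def query_likelihood(query, lm, vocab_size, mu=0.01):
    """log p(Q|D) with simple additive smoothing (illustrative; the papers
    typically use Dirichlet or Jelinek-Mercer smoothing)."""
    return sum(math.log((lm.get(w, 0.0) + mu) / (1 + mu * vocab_size))
               for w in query)

def relevance_model(query, docs):
    """RM1-style estimate: p(w|R) proportional to sum_D p(w|D) p(Q|D),
    taken over the pseudo-relevant (top-retrieved) documents."""
    lms = [doc_lm(d) for d in docs]
    vocab = {w for d in docs for w in d}
    weights = [math.exp(query_likelihood(query, lm, len(vocab))) for lm in lms]
    z = sum(weights)
    return {w: sum(wt * lm.get(w, 0.0) for wt, lm in zip(weights, lms)) / z
            for w in vocab}

query = ["speech", "summarization"]
docs = [
    "speech summarization selects salient sentences".split(),
    "language modeling for speech summarization".split(),
    "image classification with deep networks".split(),   # off-topic document
]
expanded = relevance_model(query, docs)
```

Terms from on-topic feedback documents ("salient", "sentences") receive far more mass in the expanded query model than terms from the off-topic one.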


International Conference on Multimedia and Expo | 2014

A recurrent neural network language modeling framework for extractive speech summarization

Kuan Yu Chen; Shih Hung Liu; Berlin Chen; Hsin-Min Wang; Wen-Lian Hsu; Hsin-Hsi Chen

Extractive speech summarization, with the purpose of automatically selecting a set of representative sentences from a spoken document so as to concisely express the most important theme of the document, has been an active area of research and development. A recent school of thought is to employ the language modeling (LM) approach for important sentence selection, which has proven to be effective for performing speech summarization in an unsupervised fashion. However, one of the major challenges facing the LM approach is how to formulate the sentence models and accurately estimate their parameters for each spoken document to be summarized. This paper presents a continuation of this general line of research and its contribution is two-fold. First, we propose a novel and effective recurrent neural network language modeling (RNNLM) framework for speech summarization, on top of which the deduced sentence models are able to render not only word usage cues but also long-span structural information of word co-occurrence relationships within spoken documents, getting around the need for the strict bag-of-words assumption. Second, the utilities of the method originated from our proposed framework and several widely-used unsupervised methods are analyzed and compared extensively. A series of experiments conducted on a broadcast news summarization task seem to demonstrate the performance merits of our summarization method when compared to several state-of-the-art existing unsupervised methods.


IEEE Automatic Speech Recognition and Understanding Workshop | 2015

Incorporating paragraph embeddings and density peaks clustering for spoken document summarization

Kuan Yu Chen; Kai Wun Shih; Shih Hung Liu; Berlin Chen; Hsin-Min Wang

Representation learning has emerged as a newly active research subject in many machine learning applications because of its excellent performance. As an instantiation, word embedding has been widely used in the natural language processing area. However, as far as we are aware, there are relatively few studies investigating paragraph embedding methods in extractive text or speech summarization. Extractive summarization aims at selecting a set of indicative sentences from a source document to express the most important theme of the document. There is a general consensus that relevance and redundancy are both critical issues for users in a realistic summarization scenario. However, most of the existing methods focus on determining only the relevance degree between sentences and a given document, while the redundancy degree is calculated by a post-processing step. Based on these observations, three contributions are proposed in this paper. First, we comprehensively compare the word and paragraph embedding methods for spoken document summarization. Next, we propose a novel summarization framework which can take both relevance and redundancy information into account simultaneously. Consequently, a set of representative sentences can be automatically selected through a one-pass process. Third, we further plug in paragraph embedding methods into the proposed framework to enhance the summarization performance. Experimental results demonstrate the effectiveness of our proposed methods, compared to existing state-of-the-art methods.
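The one-pass selection idea above can be sketched with density-peaks scoring over sentence embeddings: each point gets a local density (a Gaussian-kernel sum over its neighbors, covering relevance) and a separation delta (distance to the nearest denser point, covering redundancy), and their product ranks the sentences. The kernel choice, cutoff, and toy vectors are illustrative assumptions.

```python
import math

def euclid(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def density_peaks_scores(vectors, cutoff=0.5):
    """Score each point by rho * delta: rho is a Gaussian-kernel local density,
    delta is the distance to the nearest point of higher density (or the
    maximum distance if none exists). High scores mark cluster centers."""
    n = len(vectors)
    dist = [[euclid(vectors[i], vectors[j]) for j in range(n)] for i in range(n)]
    rho = [sum(math.exp(-(dist[i][j] / cutoff) ** 2)
               for j in range(n) if j != i) for i in range(n)]
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(dist[i]))
    return [r * d for r, d in zip(rho, delta)]

# Toy "sentence embeddings": two tight clusters; the central point of each
# cluster should receive the highest score, yielding a non-redundant pick.
vecs = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.2), (0.1, 0.1),   # cluster A
        (5.0, 5.0), (5.2, 5.0), (5.0, 5.2), (5.1, 5.1)]   # cluster B
scores = density_peaks_scores(vecs)
```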


International Conference on Acoustics, Speech, and Signal Processing | 2014

Effective pseudo-relevance feedback for language modeling in extractive speech summarization

Shih Hung Liu; Kuan Yu Chen; Yu Lun Hsieh; Berlin Chen; Hsin-Min Wang; Hsu Chun Yen; Wen-Lian Hsu

Extractive speech summarization, aiming to automatically select an indicative set of sentences from a spoken document so as to concisely represent the most important aspects of the document, has become an active area for research and experimentation. An emerging stream of work is to employ the language modeling (LM) framework along with the Kullback-Leibler divergence measure for extractive speech summarization, which can perform important sentence selection in an unsupervised manner and has shown preliminary success. This paper presents a continuation of such a general line of research and its main contribution is two-fold. First, by virtue of pseudo-relevance feedback, we explore several effective sentence modeling formulations to enhance the sentence models involved in the LM-based summarization framework. Second, the utilities of our summarization methods and several widely-used methods are analyzed and compared extensively, which demonstrates the effectiveness of our methods.


Speech Communication | 2016

Exploring the use of unsupervised query modeling techniques for speech recognition and summarization

Kuan Yu Chen; Shih Hung Liu; Berlin Chen; Hsin-Min Wang; Hsin-Hsi Chen

Statistical language modeling (LM) that intends to quantify the acceptability of a given piece of text has long been an interesting yet challenging research area. In particular, language modeling for information retrieval (IR) has enjoyed remarkable empirical success; one emerging stream of the LM approach for IR is to employ the pseudo-relevance feedback process to enhance the representation of an input query so as to improve retrieval effectiveness. This paper presents a continuation of such a general line of research and the major contributions are three-fold. First, we propose a principled framework which can unify the relationships among several widely-cited query modeling formulations. Second, on top of this successfully developed framework, two extensions have been proposed. On one hand, we present an extended query modeling formulation by incorporating critical query-specific information cues to guide the model estimation. On the other hand, a word-based relevance modeling has also been leveraged to overcome the obstacle of time-consuming model estimation when the framework is being utilized for practical applications. In addition, we further adopt and formalize such a framework to the speech recognition and summarization tasks. A series of experiments reveal the empirical potential of such an LM framework and the performance merits of the deduced models on these two tasks.


International Conference on Technologies and Applications of Artificial Intelligence | 2015

Distributed keyword vector representation for document categorization

Yu Lun Hsieh; Shih Hung Liu; Yung Chun Chang; Wen-Lian Hsu

In the age of information explosion, efficiently categorizing the topic of a document can assist our organization and comprehension of the vast amount of text. In this paper, we propose a novel approach, named DKV, for document categorization using distributed real-valued vector representation of keywords learned from neural networks. Such a representation can project rich context information (or embedding) into the vector space, and subsequently be used to infer similarity measures among words, sentences, and even documents. Using a Chinese news corpus containing over 100,000 articles and five topics, we provide a comprehensive performance evaluation to demonstrate that by exploiting the keyword embeddings, DKV paired with support vector machines can effectively categorize a document into the predefined topics. Results demonstrate that our method can achieve the best performances compared to several other approaches.
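The document-representation step described above can be sketched by averaging pretrained keyword embeddings into a document vector. The paper pairs these features with support vector machines; the nearest-centroid classifier, toy 2-d "embeddings", and labels below are stand-in assumptions that keep the sketch dependency-free.

```python
# Toy keyword "embeddings" standing in for vectors learned by a neural network.
keyword_vectors = {
    "election": (1.0, 0.0), "vote": (0.9, 0.1),
    "match": (0.0, 1.0), "score": (0.1, 0.9),
}

def doc_vector(tokens):
    """Average the embeddings of the keywords present in the document."""
    vecs = [keyword_vectors[t] for t in tokens if t in keyword_vectors]
    if not vecs:
        return (0.0, 0.0)
    return tuple(sum(v[i] for v in vecs) / len(vecs) for i in range(2))

train = {"politics": ["election vote vote", "vote election"],
         "sports": ["match score", "score score match"]}

# One centroid per topic, averaged over that topic's training documents.
centroids = {}
for label, docs in train.items():
    dv = [doc_vector(d.split()) for d in docs]
    centroids[label] = tuple(sum(v[i] for v in dv) / len(dv) for i in range(2))

def classify(text):
    v = doc_vector(text.split())
    return min(centroids,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])))
```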

Collaboration


Shih Hung Liu's top co-authors:

Berlin Chen (National Taiwan Normal University)
Hsu Chun Yen (National Taiwan University)
Hsin-Hsi Chen (National Taiwan University)
Fang Hui Chu (National Taiwan Normal University)