
Publication


Featured research published by Hsu Chun Yen.


IEEE Transactions on Audio, Speech, and Language Processing | 2015

Combining relevance language modeling and clarity measure for extractive speech summarization

Shih Hung Liu; Kuan Yu Chen; Berlin Chen; Hsin-Min Wang; Hsu Chun Yen; Wen-Lian Hsu

Extractive speech summarization, which aims to select an indicative set of sentences from a spoken document so as to succinctly represent the most important aspects of the document, has garnered much research attention over the years. In this paper, we cast extractive speech summarization as an ad-hoc information retrieval (IR) problem and investigate various language modeling (LM) methods for important sentence selection. The main contributions of this paper are four-fold. First, we explore a novel sentence modeling paradigm built on top of the notion of relevance, where the relationship between a candidate summary sentence and the spoken document to be summarized is discovered through different granularities of context for relevance modeling. Second, not only lexical but also topical cues inherent in the spoken document are exploited for sentence modeling. Third, we propose a novel clarity measure for use in important sentence selection, which helps quantify the thematic specificity of each individual sentence and serves as a crucial indicator orthogonal to the relevance measure provided by the LM-based methods. Fourth, in an attempt to lessen the summarization performance degradation caused by imperfect speech recognition, we investigate using different levels of index features for LM-based sentence modeling, including words, subword-level units, and their combination. Experiments on broadcast news summarization demonstrate the performance merits of our methods when compared to several existing well-developed and/or state-of-the-art methods.
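A clarity measure of this kind can be sketched, in simplified form, as the Kullback-Leibler divergence between a smoothed sentence language model and the collection model: the more a sentence's word distribution diverges from the background, the more thematically specific it is. The smoothing scheme, weights, and toy data below are illustrative assumptions, not the paper's exact formulation:

```python
from collections import Counter
import math

def unigram_lm(tokens, background, lam=0.5):
    """Unigram LM with Jelinek-Mercer smoothing against the collection model."""
    counts = Counter(tokens)
    total = len(tokens)
    return {w: lam * counts[w] / total + (1 - lam) * p
            for w, p in background.items()}

def clarity(sentence_tokens, background):
    """KL divergence between the smoothed sentence LM and the collection LM;
    higher values indicate a thematically more specific sentence."""
    p = unigram_lm(sentence_tokens, background)
    return sum(p[w] * math.log(p[w] / background[w]) for w in background)

# Toy document: the collection model is estimated from all sentences.
sentences = ["stocks fell sharply on monday".split(),
             "the market rebounded by friday".split(),
             "rain is expected on monday".split()]
all_tokens = [w for s in sentences for w in s]
background = {w: c / len(all_tokens) for w, c in Counter(all_tokens).items()}

scores = [clarity(s, background) for s in sentences]
```

Because both distributions are properly normalized, each clarity score is non-negative, and a sentence whose vocabulary closely tracks the whole document scores near zero.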


Information Reuse and Integration | 2008

An alignment-based surface pattern for a question answering system

Cheng-Lung Sung; Cheng-Wei Lee; Hsu Chun Yen; Wen-Lian Hsu

In this paper, we propose an alignment-based surface pattern approach, called ABSP, which integrates semantic information into syntactic patterns for question answering (QA). ABSP uses surface patterns to extract important terms from questions, and constructs the terms’ relations from sentences in the corpus. The relations are then used to filter appropriate answer candidates. Experiments show that ABSP can achieve high accuracy and can be incorporated into other QA systems that have high coverage. It can also be used in cross-lingual QA systems. The approach is both robust and portable to other domains.


Intelligent Systems Design and Applications | 2008

Compute the Term Contributed Frequency

Cheng-Lung Sung; Hsu Chun Yen; Wen-Lian Hsu

In this paper, we propose an algorithm and data structure for computing the term contributed frequency (tcf) of all N-grams in a text corpus. Although term frequency is one of the standard notions of frequency in corpus-based natural language processing (NLP), applying it directly to N-grams causes problems such as the distortion of phrase frequencies. We attempt to overcome this drawback by building a DAG over the proposed data structure and using it to retrieve more reliable term frequencies. Our algorithm and data structure are more efficient than traditional term frequency extraction approaches and are portable across languages.
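The distortion the paper targets is easy to reproduce with plain N-gram counting: a bigram such as "new york" accumulates counts from every longer phrase that embeds it ("new york city", "new york state"), even when it never stands alone. The sketch below shows only this baseline counting, not the paper's DAG-based tcf correction; the example tokens are made up:

```python
from collections import Counter

def ngram_counts(tokens, max_n=3):
    """Plain frequencies for all N-grams up to length max_n."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts

tokens = "new york city is near new york state".split()
counts = ngram_counts(tokens)
# "new york" is counted twice, yet both occurrences are fragments of
# longer place names -- the distortion that tcf is designed to correct.
```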


International Conference on Acoustics, Speech, and Signal Processing | 2014

Effective pseudo-relevance feedback for language modeling in extractive speech summarization

Shih Hung Liu; Kuan Yu Chen; Yu Lun Hsieh; Berlin Chen; Hsin-Min Wang; Hsu Chun Yen; Wen-Lian Hsu

Extractive speech summarization, aiming to automatically select an indicative set of sentences from a spoken document so as to concisely represent the most important aspects of the document, has become an active area for research and experimentation. An emerging stream of work is to employ the language modeling (LM) framework along with the Kullback-Leibler divergence measure for extractive speech summarization, which can perform important sentence selection in an unsupervised manner and has shown preliminary success. This paper presents a continuation of such a general line of research and its main contribution is two-fold. First, by virtue of pseudo-relevance feedback, we explore several effective sentence modeling formulations to enhance the sentence models involved in the LM-based summarization framework. Second, the utilities of our summarization methods and several widely-used methods are analyzed and compared extensively, which demonstrates the effectiveness of our methods.
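The KL-divergence LM framework with pseudo-relevance feedback can be sketched as: score each sentence by the negative KL divergence between the document model and the sentence model, take the top-ranked sentences as pseudo-relevant, fold their model back into every sentence model, and re-rank. The smoothing, interpolation weight, and toy sentences below are illustrative assumptions, not the paper's exact formulations:

```python
from collections import Counter
import math

def lm(tokens, vocab, mu=0.1):
    """Additively smoothed unigram language model over a fixed vocabulary."""
    c, t = Counter(tokens), len(tokens)
    return {w: (c[w] + mu) / (t + mu * len(vocab)) for w in vocab}

def score(doc_lm, sent_lm):
    """Negative KL(doc || sentence): higher = sentence covers the document better."""
    return -sum(p * math.log(p / sent_lm[w]) for w, p in doc_lm.items())

def rank_with_prf(sentences, alpha=0.5, feedback_k=1):
    doc_tokens = [w for s in sentences for w in s]
    vocab = set(doc_tokens)
    doc_lm = lm(doc_tokens, vocab)
    sent_lms = [lm(s, vocab) for s in sentences]
    initial = sorted(range(len(sentences)),
                     key=lambda i: score(doc_lm, sent_lms[i]), reverse=True)
    # Pseudo-relevance feedback: treat the top-ranked sentences as relevant,
    # fold their model back into every sentence model, and re-rank.
    fb = lm([w for i in initial[:feedback_k] for w in sentences[i]], vocab)
    sent_lms = [{w: alpha * m[w] + (1 - alpha) * fb[w] for w in vocab}
                for m in sent_lms]
    final = sorted(range(len(sentences)),
                   key=lambda i: score(doc_lm, sent_lms[i]), reverse=True)
    return initial, final

sents = ["the president announced a new budget".split(),
         "the budget increases education spending".split(),
         "weather was sunny".split()]
initial, final = rank_with_prf(sents)
```

On this toy input the off-topic weather sentence lands at the bottom of the initial ranking, since its model diverges most from the document model.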


Asia-Pacific Signal and Information Processing Association Annual Summit and Conference | 2014

A margin-based discriminative modeling approach for extractive speech summarization

Shih Hung Liu; Kuan Yu Chen; Berlin Chen; Ea Ee Jan; Hsin-Min Wang; Hsu Chun Yen; Wen-Lian Hsu

The task of extractive speech summarization is to select a set of salient sentences from an original spoken document and concatenate them to form a summary, facilitating users to better browse through and understand the content of the document. In this paper, we present an empirical study of leveraging various supervised discriminative methods for effectively ranking important sentences of a spoken document to be summarized. In addition, we propose a novel margin-based discriminative training (MBDT) algorithm that penalizes non-summary sentences in inverse proportion to their summarization evaluation scores, leading to better discrimination from the desired summary sentences. By doing so, the summarization model can be trained with an objective function that is closely coupled with the ultimate evaluation metric of extractive speech summarization. Furthermore, sentences of spoken documents are represented by a wide range of prosodic, lexical, and relevance features, whose utilities are extensively compared and analyzed. Experiments conducted on a Mandarin broadcast news summarization task demonstrate the performance merits of our summarization method when compared to several well-studied state-of-the-art supervised and unsupervised methods.
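The core idea of penalizing non-summary sentences in inverse proportion to their evaluation scores can be illustrated with a pairwise hinge loss whose required margin widens as a non-summary sentence's evaluation score shrinks. This is a hedged sketch of the idea, not the paper's MBDT algorithm; the features, evaluation scores, and numerical-gradient training loop below are all made up for illustration:

```python
import numpy as np

def margin_loss(w, X_pos, X_neg, eval_neg):
    """Pairwise hinge loss in which the required margin between a summary
    sentence and a non-summary sentence grows as the latter's summarization
    evaluation score shrinks (worse sentences are pushed further away)."""
    loss = 0.0
    for xp in X_pos:
        for xn, e in zip(X_neg, eval_neg):
            required = 1.0 - e                 # low eval score -> wide margin
            loss += max(0.0, required - (w @ xp - w @ xn))
    return loss

# Toy features [relevance, length] for summary (pos) / non-summary (neg) sentences.
X_pos = np.array([[0.9, 0.5], [0.8, 0.6]])
X_neg = np.array([[0.2, 0.5], [0.1, 0.4]])
eval_neg = np.array([0.3, 0.1])                # e.g., ROUGE of non-summary sentences

# Crude numerical-gradient descent, just to show the loss is trainable.
w = np.zeros(2)
for _ in range(200):
    grad = np.zeros(2)
    for j in range(2):
        step = np.zeros(2); step[j] = 1e-4
        grad[j] = (margin_loss(w + step, X_pos, X_neg, eval_neg)
                   - margin_loss(w - step, X_pos, X_neg, eval_neg)) / 2e-4
    w -= 0.1 * grad
trained = margin_loss(w, X_pos, X_neg, eval_neg)
```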


ACM Transactions on Asian and Low-Resource Language Information Processing | 2017

A Position-Aware Language Modeling Framework for Extractive Broadcast News Speech Summarization

Shih Hung Liu; Kuan Yu Chen; Yu Lun Hsieh; Berlin Chen; Hsin-Min Wang; Hsu Chun Yen; Wen-Lian Hsu

Extractive summarization, a process that automatically picks exemplary sentences from a text (or spoken) document with the goal of concisely conveying the key information therein, has recently seen a surge of attention from scholars and practitioners. Using a language modeling (LM) approach for sentence selection has proven effective for unsupervised extractive summarization. However, one of the major difficulties facing the LM approach is to model sentences and estimate their parameters more accurately for each text (or spoken) document. We extend this line of research and make the following contributions in this work. First, we propose a position-aware language modeling framework that uses various granularities of position-specific information to better estimate the sentence models involved in the summarization process. Second, we explore disparate ways to integrate the positional cues into relevance models through a pseudo-relevance feedback procedure. Third, we extensively evaluate various models originating from our proposed framework alongside several well-established unsupervised methods. Empirical evaluation conducted on a broadcast news summarization task further demonstrates the performance merits of the proposed summarization methods.
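One simple way position-specific information can enter sentence scoring is through a decaying positional prior that encodes the lead bias of news (early sentences are more likely to be summary-worthy), interpolated with each sentence's LM score. The exponential prior, interpolation weight, and toy scores below are illustrative assumptions, not the paper's framework:

```python
import math

def position_aware_scores(lm_scores, tau=2.0, beta=0.5):
    """Interpolate each sentence's LM log-score with the log of a decaying
    positional prior that favors lead sentences (the lead bias of news)."""
    n = len(lm_scores)
    priors = [math.exp(-i / tau) for i in range(n)]
    z = sum(priors)
    return [beta * s + (1 - beta) * math.log(p / z)
            for s, p in zip(lm_scores, priors)]

# Equal relevance: position alone breaks the tie in favor of early sentences.
tied = position_aware_scores([-1.0, -1.0, -1.0])
# A clearly more relevant later sentence can still win.
mixed = position_aware_scores([-5.0, -1.0, -1.0])
```

The interpolation weight beta controls how strongly position can override relevance; with beta = 1 the prior is ignored entirely.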


Conference of the International Speech Communication Association | 2016

Exploring word Mover's distance and semantic-aware embedding techniques for extractive broadcast news summarization

Shih Hung Liu; Kuan Yu Chen; Yu Lun Hsieh; Berlin Chen; Hsin-Min Wang; Hsu Chun Yen; Wen-Lian Hsu

Extractive summarization is a process that selects the most salient sentences from a document (or a set of documents) and assembles them into an informative summary, helping users browse and assimilate the main theme of the document efficiently. Our work in this paper continues this general line of research and its main contributions are two-fold. First, we explore leveraging the recently proposed word mover's distance (WMD) metric, in conjunction with semantic-aware continuous-space representations of words, to capture finer-grained sentence-to-document and/or sentence-to-sentence semantic relatedness for effective use in the summarization process. Second, we investigate combining our proposed approach with several state-of-the-art summarization methods, which originally adopted conventional term-overlap or bag-of-words (BOW) approaches for similarity calculation. A series of experiments conducted on a typical broadcast news summarization task suggest the performance merits of our proposed approach in comparison to the mainstream methods.
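The full WMD solves an optimal-transport problem between two word distributions; a common cheap lower bound, the relaxed WMD, lets each word simply travel to its nearest counterpart in the other text. The sketch below uses this relaxation with tiny hand-crafted embeddings (real systems use learned embeddings such as word2vec or GloVe); it is an illustration of the metric's behavior, not the paper's implementation:

```python
import numpy as np

def relaxed_wmd(doc1, doc2, emb):
    """Relaxed word mover's distance: each word in doc1 travels to its nearest
    word in doc2. A cheap lower bound on the optimal-transport WMD."""
    return sum(min(np.linalg.norm(emb[w] - emb[v]) for v in doc2)
               for w in doc1) / len(doc1)

# Tiny hand-made "semantic" embeddings (hypothetical values).
emb = {"president": np.array([1.0, 0.0]), "leader": np.array([0.9, 0.1]),
       "speaks":    np.array([0.0, 1.0]), "talks":  np.array([0.1, 0.9]),
       "banana":    np.array([-1.0, -1.0])}

# Semantically related sentences score close even with zero word overlap.
near = relaxed_wmd(["president", "speaks"], ["leader", "talks"], emb)
far  = relaxed_wmd(["president", "speaks"], ["banana"], emb)
```

Note that "president speaks" and "leader talks" share no words, so a bag-of-words overlap measure would call them unrelated, while the embedding-based distance correctly ranks them far closer than the unrelated pair.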


Asia-Pacific Signal and Information Processing Association Annual Summit and Conference | 2016

Exploiting graph regularized nonnegative matrix factorization for extractive speech summarization

Shih Hung Liu; Kuan Yu Chen; Yu Lun Hsieh; Berlin Chen; Hsin-Min Wang; Hsu Chun Yen; Wen-Lian Hsu

Extractive summarization systems attempt to automatically pick out representative sentences from a source text or spoken document and concatenate them into a concise summary so as to help people grasp salient information effectively and efficiently. Recent advances in applying nonnegative matrix factorization (NMF) on various tasks including summarization motivate us to extend this line of research and provide the following contributions. First, we propose to employ graph-regularized nonnegative matrix factorization (GNMF), in which an affinity graph with its similarity measure tailored to the evaluation metric of summarization is constructed and in turn serves as a neighborhood preserving constraint of NMF, so as to better represent the semantic space of sentences in the document to be summarized. Second, we further consider sparsity and orthogonality constraints on NMF and GNMF for better selection of representative sentences to form a summary. Extensive experiments conducted on a Mandarin broadcast news speech dataset demonstrate the effectiveness of the proposed unsupervised summarization models, in relation to several widely-used state-of-the-art methods compared in the paper.
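GNMF in the style of Cai et al. augments the NMF reconstruction objective with a graph-Laplacian penalty so that sentences adjacent in the affinity graph receive similar low-dimensional representations. The multiplicative-update sketch below uses a generic affinity graph and random data; the paper's tailored similarity measure and the sparsity/orthogonality constraints are not reproduced here:

```python
import numpy as np

def gnmf(X, W, k=2, lam=0.1, iters=200, seed=0):
    """Graph-regularized NMF via multiplicative updates:
    minimize ||X - U V^T||_F^2 + lam * Tr(V^T L V), with L = D - W the
    Laplacian of the sentence affinity graph W (rows of X = terms,
    columns = sentences; rows of V = sentence representations)."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    U, V = rng.random((m, k)), rng.random((n, k))
    D = np.diag(W.sum(axis=1))
    eps = 1e-9
    for _ in range(iters):
        U *= (X @ V) / (U @ (V.T @ V) + eps)
        V *= (X.T @ U + lam * (W @ V)) / (V @ (U.T @ U) + lam * (D @ V) + eps)
    return U, V

def gnmf_objective(X, U, V, W, lam=0.1):
    L = np.diag(W.sum(axis=1)) - W
    return np.linalg.norm(X - U @ V.T) ** 2 + lam * np.trace(V.T @ L @ V)

rng = np.random.default_rng(1)
X = rng.random((6, 4))            # toy term-by-sentence matrix
W = np.ones((4, 4)) - np.eye(4)   # fully connected affinity graph
U1, V1 = gnmf(X, W, iters=1)
U2, V2 = gnmf(X, W, iters=200)
```

The multiplicative form keeps both factors nonnegative throughout and drives the regularized objective down monotonically, which is why it is preferred over plain gradient steps here.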


Asia-Pacific Signal and Information Processing Association Annual Summit and Conference | 2015

Incorporating proximity information in relevance language modeling for extractive speech summarization

Shih Hung Liu; Hung Shih Lee; Hsiao Tsung Hung; Kuan Yu Chen; Berlin Chen; Hsin-Min Wang; Hsu Chun Yen; Wen-Lian Hsu

Extractive speech summarization refers to automatic selection of an indicative set of sentences from a spoken document so as to offer a concise digest covering the most salient aspects of the original document. The language modeling (LM) framework alongside the pseudo-relevance feedback (PRF) technique has emerged as a promising line of research for conducting extractive speech summarization in an unsupervised manner, showing some preliminary success. This paper extends such a general line of research and its main contributions are two-fold. First, we explore several effective formulations of proximity-based cues for use in the sentence modeling process involved in the LM-based summarization framework. Second, the utilities of the methods instantiated from the LM-based summarization framework and several well-practiced state-of-the-art methods are analyzed and compared extensively. The empirical results suggest the effectiveness of our methods.
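A standard way to encode proximity cues in language modeling (in the spirit of positional language models, not necessarily the paper's exact formulation) is to let every term occurrence spread a soft, kernel-weighted count to nearby positions, so that terms occurring close together reinforce each other when the sentence model is estimated. The Gaussian kernel, bandwidth, and toy sentence below are illustrative assumptions:

```python
import math

def proximity_counts(tokens, sigma=2.0):
    """Gaussian-kernel count propagation: every occurrence of a term spreads a
    soft count to nearby positions, so terms appearing close to one another
    reinforce each other when a positional LM is later estimated."""
    n = len(tokens)
    counts = {w: [0.0] * n for w in set(tokens)}
    for i, w in enumerate(tokens):
        for j in range(n):
            counts[w][j] += math.exp(-((i - j) ** 2) / (2 * sigma ** 2))
    return counts

c = proximity_counts("budget cuts announced in the budget speech".split())
```

A term's soft count peaks at its own positions and decays with distance, so estimating a language model from a given position naturally emphasizes the terms nearest to it.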


Conference of the International Speech Communication Association | 2015

Positional language modeling for extractive broadcast news speech summarization

Shih Hung Liu; Kuan Yu Chen; Berlin Chen; Hsin-Min Wang; Hsu Chun Yen; Wen-Lian Hsu

Collaboration


Dive into Hsu Chun Yen's collaborations.

Top Co-Authors
Berlin Chen

National Taiwan Normal University


Hsiao Tsung Hung

National Taiwan Normal University
