Jen Yuan Yeh | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Jen Yuan Yeh is active.

Explore More

Publication

Featured researches published by Jen Yuan Yeh.

Information Processing and Management | 2005

Text summarization using a trainable summarizer and latent semantic analysis

Jen Yuan Yeh; Hao Ren Ke; Wei-Pang Yang; I-Heng Meng

This paper proposes two approaches to address text summarization: modified corpus-based approach (MCBA) and LSA-based T.R.M. approach (LSA + T.R.M.). The first is a trainable summarizer, which takes into account several features, including position, positive keyword, negative keyword, centrality, and the resemblance to the title, to generate summaries. Two new ideas are exploited: (1) sentence positions are ranked to emphasize the significances of different sentence positions, and (2) the score function is trained by the genetic algorithm (GA) to obtain a suitable combination of feature weights. The second uses latent semantic analysis (LSA) to derive the semantic matrix of a document or a corpus and uses semantic sentence representation to construct a semantic text relationship map. We evaluate LSA + T.R.M. both with single documents and at the corpus level to investigate the competence of LSA in text summarization. The two novel approaches were measured at several compression rates on a data corpus composed of 100 political articles. When the compression rate was 30%, an average f-measure of 49% for MCBA, 52% for MCBA + GA, 44% and 40% for LSA + T.R.M. in single-document and corpus level were achieved respectively.

Expert Systems With Applications | 2008

iSpreadRank: Ranking sentences for extraction-based summarization using feature weight propagation in the sentence similarity network

Jen Yuan Yeh; Hao Ren Ke; Wei-Pang Yang

Sentence extraction is a widely adopted text summarization technique where the most important sentences are extracted from document(s) and presented as a summary. The first step towards sentence extraction is to rank sentences in order of importance as in the summary. This paper proposes a novel graph-based ranking method, iSpreadRank, to perform this task. iSpreadRank models a set of topic-related documents into a sentence similarity network. Based on such a network model, iSpreadRank exploits the spreading activation theory to formulate a general concept from social network analysis: the importance of a node in a network (i.e., a sentence in this paper) is determined not only by the number of nodes to which it connects, but also by the importance of its connected nodes. The algorithm recursively re-weights the importance of sentences by spreading their sentence-specific feature scores throughout the network to adjust the importance of other sentences. Consequently, a ranking of sentences indicating the relative importance of sentences is reasoned. This paper also develops an approach to produce a generic extractive summary according to the inferred sentence ranking. The proposed summarization method is evaluated using the DUC 2004 data set, and found to perform well. Experimental results show that the proposed method obtains a ROUGE-1 score of 0.38068, which represents a slight difference of 0.00156, when compared with the best participant in the DUC 2004 evaluation.

international conference on asian digital libraries | 2002

Chinese Text Summarization Using a Trainable Summarizer and Latent Semantic Analysis

Jen Yuan Yeh; Hao Ren Ke; Wei-Pang Yang

In this paper, two novel approaches are proposed to extract important sentences from a document to create its summary. The first is a corpus-based approach using feature analysis. It brings up three new ideas: 1) to employ ranked position to emphasize the significance of sentence position, 2) to reshape word unit to achieve higher accuracy of keyword importance, and 3) to train a score function by the genetic algorithm for obtaining a suitable combination of feature weights. The second approach combines the ideas of latent semantic analysis and text relationship maps to interpret conceptual structures of a document. Both approaches are applied to Chinese text summarization. The two approaches were evaluated by using a data corpus composed of 100 articles about politics from New Taiwan Weekly, and when the compression ratio was 30%, average recalls of 52.0% and 45.6% were achieved respectively.

cross language evaluation forum | 2004

Comparison and combination of textual and visual features for interactive cross-language image retrieval

Pei-Cheng Cheng; Jen Yuan Yeh; Hao Ren Ke; Been-Chian Chien; Wei-Pang Yang

This paper concentrates on the user-centered search task at ImageCLEF 2004. In this work, we combine both textual and visual features for cross-language image retrieval, and propose two interactive retrieval systems – T_ICLEF and VCT_ICLEF. The first one incorporates a relevance feedback mechanism based on textual information while the second one combines textual and image information to help users find a target image. The experimental results show that VCT_ICLEF had a better performance in almost all cases. Overall, it helped users find the topic image within a fewer iterations with a maximum of 2 iterations saved. Our user survey also reported that a combination of textual and visual information is helpful to indicate to the system what a user really wanted in mind.

Archive | 2007