Lee-Feng Chien | Researchain

Publication

Featured research published by Lee-Feng Chien.

International ACM SIGIR Conference on Research and Development in Information Retrieval | 1997

PAT-tree-based keyword extraction for Chinese information retrieval

Lee-Feng Chien

... urgent need to promote Chinese ... In this paper, we raise the significance of keyword extraction using a new PAT-tree-based approach, which is efficient in automatic keyword extraction from a set of relevant Chinese documents. This approach has been successfully applied in several IR research tasks, such as document classification, book indexing, and relevance feedback. Many Chinese language processing applications can therefore step ahead from the character level to the word/phrase level ...

Journal of the Association for Information Science and Technology | 2003

Relevant term suggestion in interactive web search based on contextual information in query session logs

Chien-Kang Huang; Lee-Feng Chien; Yen-Jen Oyang

This paper proposes an effective term suggestion approach to interactive Web search. Conventional approaches to making term suggestions involve extracting co-occurring key terms from highly ranked retrieved documents. Such approaches must deal with term extraction difficulties and interference from irrelevant documents, and, more importantly, have difficulty extracting terms that are conceptually related but do not frequently co-occur in documents. In this paper, we present a new, effective log-based approach to relevant term extraction and term suggestion. Using this approach, the relevant terms suggested for a user query are those that co-occur in similar query sessions from search engine logs, rather than in the retrieved documents. In addition, the suggested terms in each interactive search step can be organized according to their relevance to the entire query session, rather than to the most recent single query as in conventional approaches. The proposed approach was tested using a proxy server log containing about two million query transactions submitted to search engines in Taiwan. The experimental results show that the proposed approach can provide organized and highly relevant terms, and can exploit the contextual information in a user's query session to make more effective suggestions.
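The log-based idea in this abstract can be illustrated with a minimal sketch. It assumes that each logged session is simply a list of query strings and that a candidate term is scored by how many queries its session shares with the current session; the function name `suggest_terms`, the scoring rule, and the toy log are all hypothetical and are not taken from the paper.

```python
from collections import Counter

def suggest_terms(sessions, current_session, top_k=3):
    """Suggest queries that co-occur with the current session in logged sessions.

    Assumption: a candidate's score is the summed overlap between the current
    session and every logged session that contains the candidate.
    """
    context = set(current_session)
    scores = Counter()
    for session in sessions:
        logged = set(session)
        overlap = len(context & logged)
        if overlap == 0:
            continue  # this logged session shares nothing with the user's context
        for term in logged - context:
            scores[term] += overlap
    # Sort by descending score, then alphabetically so ties are deterministic
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_k]]

# Hypothetical query sessions standing in for a proxy server log
LOG_SESSIONS = [
    ["jaguar", "jaguar price", "luxury car"],
    ["jaguar", "big cat", "zoo"],
    ["luxury car", "jaguar price", "car dealer"],
    ["zoo", "panda"],
]

# Using the whole session as context favors the car sense of "jaguar"
print(suggest_terms(LOG_SESSIONS, ["jaguar", "luxury car"]))
# → ['jaguar price', 'big cat', 'car dealer']
```

With only the single query "jaguar" as context, all candidates tie; adding "luxury car" to the session moves "jaguar price" to the top, which is the session-level effect the abstract describes.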

International Conference on Speech, Image Processing and Neural Networks | 1994

Golden Mandarin (II)-an intelligent Mandarin dictation machine for Chinese character input with adaptation/learning functions

Lin-Shan Lee; Keh-Jiann Chen; Chiu-yu Tseng; Ren-Yuan Lyu; Lee-Feng Chien; Hsin-Min Wang; Jia-Lin Shen; Sung-Chien Lin; Yen-Ju Yang; Bo-Ren Bai; Chi-ping Nee; Chun-Yi Liao; Shueh-Sheng Lin; Chung-Shu Yang; I-Jung Hung; Ming-Yu Lee; Rei-Chang Wang; Bo-Shen Lin; Yuan-Cheng Chang; Rung-Chiung Yang; Yung-Chi Huang; Chen-Yuan Lou; Tung-Sheng Lin

Golden Mandarin (II) is an intelligent single-chip-based real-time Mandarin dictation machine for the Chinese language, with a very large vocabulary, for the input of unlimited Chinese texts into computers using voice. This dictation machine can be installed on any personal computer and uses only a single Motorola DSP 96002D chip, with a preliminary character correct rate of around 95% at a speed of 0.6 s per character. Various adaptation/learning functions have been developed for this machine, including fast adaptation to new speakers and on-line learning of the voice characteristics, task domains, word patterns, and noise environments of the users, so the machine can be easily personalized for each user. These adaptation/learning functions are the major subjects of the paper.

International ACM SIGIR Conference on Research and Development in Information Retrieval | 2004

Translating unknown queries with web corpora for cross-language information retrieval

Pu-Jen Cheng; Jei Wen Teng; Ruei Cheng Chen; Jenq-Haur Wang; Wen Hsiang Lu; Lee-Feng Chien

It is crucial for cross-language information retrieval (CLIR) systems to deal with the translation of unknown queries, because real queries are often short. The purpose of this paper is to investigate the feasibility of exploiting the Web as the corpus source to translate unknown queries for CLIR. We propose an online translation approach to determine effective translations for unknown query terms via mining of bilingual search-result pages obtained from Web search engines. This approach can alleviate the problem of the lack of large bilingual corpora, translate many unknown query terms, provide flexible query specifications, and extract semantically close translations to benefit CLIR tasks, especially for cross-language Web search.

Conference on Information and Knowledge Management | 2004

A practical web-based approach to generating topic hierarchy for text segments

Shui-Lung Chuang; Lee-Feng Chien

It is crucial in many information systems to organize short text segments, such as keywords in documents and queries from users, into a well-formed topic hierarchy. In this paper, we address the problem of generating topic hierarchies for diverse text segments with a general and practical approach that uses the Web as an additional knowledge source. Unlike long documents, short text segments typically do not contain enough information to extract reliable features. This work investigates the possibilities of using highly ranked search-result snippets to enrich the representation of text segments. A hierarchical clustering algorithm is then applied to create the hierarchical topic structure of text segments. Unlike traditional clustering algorithms, which tend to produce cluster hierarchies with a very unnatural shape, the approach tries to produce a more natural and comprehensive hierarchy. Extensive experiments were conducted on different domains of text segments. The results show the potential of the proposed approach, which we believe can benefit many information systems.

ACM Transactions on Information Systems | 2004

Anchor text mining for translation of Web queries: A transitive translation approach

Wen Hsiang Lu; Lee-Feng Chien; Hsi-Jian Lee

To discover translation knowledge in diverse data resources on the Web, this article proposes an effective approach to finding translation equivalents of query terms and constructing multilingual lexicons through the mining of Web anchor texts and link structures. Although Web anchor texts are wide-scoped hypertext resources, not every particular pair of languages contains sufficient anchor texts for effective extraction of translations for Web queries. For more generalized applications, the approach is designed based on a transitive translation model. The translation equivalents of a query term can be extracted via its translation in an intermediate language. To reduce interference from translation errors, the approach further integrates a competitive linking algorithm into the process of determining the most probable translation. A series of experiments has been conducted, including performance tests on term translation extraction, cross-language information retrieval, and translation suggestions for practical Web search services. The experimental results show that the proposed approach is effective in extracting translations of unknown queries, is easy to combine with the probabilistic retrieval model to improve cross-language retrieval performance, and is very useful when the considered language pairs lack a sufficient number of anchor texts. Based on the approach, an experimental system called LiveTrans has been developed for English-Chinese cross-language Web search.
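The transitive step, translating a query term through an intermediate language, can be sketched as a sum over pivot terms. This is a simplified illustration and not the article's model: it assumes translation probabilities are already available as dictionaries, it omits the competitive linking algorithm, and the function names, probabilities, and romanized placeholder terms are all hypothetical.

```python
def transitive_scores(source_to_pivot, pivot_to_target):
    """Score target candidates as the sum over pivot terms m of P(m | source) * P(target | m)."""
    scores = {}
    for pivot, p_source_pivot in source_to_pivot.items():
        # A pivot term with no known target translations contributes nothing
        for target, p_pivot_target in pivot_to_target.get(pivot, {}).items():
            scores[target] = scores.get(target, 0.0) + p_source_pivot * p_pivot_target
    return scores

def best_translation(source_to_pivot, pivot_to_target):
    """Return the highest-scoring target candidate, or None if there is none."""
    scores = transitive_scores(source_to_pivot, pivot_to_target)
    return max(scores, key=scores.get) if scores else None

# Hypothetical probabilities: one source term -> English pivot terms -> target candidates
SOURCE_TO_PIVOT = {"Sony": 0.8, "Sonny": 0.2}
PIVOT_TO_TARGET = {
    "Sony": {"xinli": 0.75, "suoni": 0.25},  # romanized placeholder target terms
    "Sonny": {"sangni": 1.0},
}

print(best_translation(SOURCE_TO_PIVOT, PIVOT_TO_TARGET))  # → xinli
```

Summing over pivots lets a target candidate accumulate evidence from several intermediate translations, which is why a pair of languages with few direct anchor texts can still yield a ranking.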

International Conference on Acoustics, Speech, and Signal Processing | 1993

Golden Mandarin (II)-an improved single-chip real-time Mandarin dictation machine for Chinese language with very large vocabulary

Lin-Shan Lee; Chiu-yu Tseng; Keh-Jiann Chen; I-Jung Hung; Ming-Yu Lee; Lee-Feng Chien; Yumin Lee; Ren-Yuan Lyu; Hsin-Min Wang; Yung-Chuan Wu; Tung-Sheng Lin; Hung-yan Gu; Chi-ping Nee; Chun-Yi Liao; Yeng-Ju Yang; Yuan-Cheng Chang; Rung-Chiung Yang

Golden Mandarin (II) is an improved single-chip real-time Mandarin dictation machine with a very large vocabulary for the input of unlimited Chinese sentences into computers using voice. This dictation machine uses only a single Motorola DSP 96002D chip on an Ariel DSP-96 card, with a preliminary character correct rate of around 95% in speaker-dependent mode at a speed of 0.36 s per character. This is achieved by many new techniques, primarily a segmental probability modeling technique for syllable recognition that takes into account the characteristics of Mandarin syllables, and a word-lattice-based Chinese character bigram for character identification that takes into account the structure of the Chinese language.

ACM Transactions on Asian Language Information Processing | 2002

Translation of web queries using anchor text mining

Wen Hsiang Lu; Lee-Feng Chien; Hsi-Jian Lee

This article presents an approach to automatically extracting translations of Web query terms through mining of Web anchor texts and link structures. One of the existing difficulties in cross-language information retrieval (CLIR) and Web search is the lack of appropriate translations of new terminology and proper names. The proposed approach successfully exploits the anchor-text resources and reduces the existing difficulties of query term translation. Many query terms that cannot be obtained in general-purpose translation dictionaries are, therefore, extracted.

International Conference on Data Mining | 2002

Towards automatic generation of query taxonomy: a hierarchical query clustering approach

Shui-Lung Chuang; Lee-Feng Chien

Most previous work on automatic query clustering generated a flat, un-nested partition of query terms. In this work, we discuss the organization of query terms into a hierarchical structure and construct a query taxonomy in an automatic way. The proposed approach is based on a hierarchical agglomerative clustering algorithm that hierarchically groups similar queries and generates cluster hierarchies using a novel cluster partition technique. The search processes of real-world search engines are combined to obtain highly ranked Web documents as the feature source for each query term. Preliminary experiments show that the proposed approach is effective for obtaining thesaurus information for query terms, and is also feasible for constructing a query taxonomy that provides a basis for in-depth analysis of users' search interests and domain-specific vocabulary on a larger scale.
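The agglomerative step can be sketched in a few lines. The sketch below is a plain average-link clustering that produces a binary hierarchy; it does not implement the paper's cluster partition technique, and it assumes each query is described by a hand-written set of feature terms rather than by highly ranked Web documents. All names and data are hypothetical.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity between two feature sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def average_link(members1, members2, features):
    """Mean pairwise similarity between the members of two clusters."""
    pairs = [(x, y) for x in members1 for y in members2]
    return sum(jaccard(features[x], features[y]) for x, y in pairs) / len(pairs)

def agglomerate(features):
    """Merge the two most similar clusters until one remains; return a nested tuple."""
    if not features:
        raise ValueError("at least one query is required")
    # Each cluster is (tuple of member queries, hierarchy built so far)
    clusters = [((query,), query) for query in sorted(features)]
    while len(clusters) > 1:
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda ij: average_link(clusters[ij[0]][0], clusters[ij[1]][0], features),
        )
        (members1, tree1), (members2, tree2) = clusters[i], clusters[j]
        merged = (members1 + members2, (tree1, tree2))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][1]

# Hypothetical feature terms standing in for terms from retrieved documents
QUERY_FEATURES = {
    "python": {"programming", "language", "code"},
    "java": {"programming", "language", "jvm"},
    "tiger": {"animal", "cat", "wild"},
    "lion": {"animal", "cat", "pride"},
}

print(agglomerate(QUERY_FEATURES))  # → (('java', 'python'), ('lion', 'tiger'))
```

A binary tree like this is the kind of raw hierarchy that a later partitioning step could flatten into broader, multi-way levels.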

International Conference on Data Mining | 2005

Text representation: from vector to tensor

Ning Liu; Benyu Zhang; Jun Yan; Zheng Chen; Wenyin Liu; Fengshan Bai; Lee-Feng Chien

In this paper, we propose a text representation model, the Tensor Space Model (TSM), which models text as a multilinear algebraic high-order tensor instead of the traditional vector. Supported by techniques of multilinear algebra, TSM offers a potent mathematical framework for analyzing multifactor structures. TSM is further supported by particular operations and tools, such as the High-Order Singular Value Decomposition (HOSVD), for dimension reduction and other applications. Experimental results on the 20 Newsgroups dataset show that TSM is consistently better than VSM for text classification.
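For reference, the HOSVD mentioned in this abstract has a standard general form, given here as background from multilinear algebra and not as the paper's specific formulation. The symbols below are the conventional ones and do not appear in the abstract: an order-N tensor A is written as a core tensor S multiplied along each mode n by an orthogonal factor matrix U^(n).

```latex
\mathcal{A} \;=\; \mathcal{S} \times_1 U^{(1)} \times_2 U^{(2)} \cdots \times_N U^{(N)},
\qquad \mathcal{A} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N},
\quad U^{(n)} \in \mathbb{R}^{I_n \times I_n},
\quad U^{(n)\top} U^{(n)} = I .
```

Each U^(n) holds the left singular vectors of the mode-n unfolding of A, and keeping only the leading columns of each U^(n) gives a reduced-dimension approximation.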
