Publication


Featured research published by Yufang Sun.


empirical methods in natural language processing | 2000

Word Alignment of English-Chinese Bilingual Corpus Based on Chunks

Le Sun; Youbing Jin; Lin Du; Yufang Sun

In this paper, a method for the word alignment of an English-Chinese corpus based on chunks is proposed. The chunks of English sentences are identified first. Then the chunk boundaries of Chinese sentences are predicted from the translations of English chunks and heuristic information. The ambiguities of Chinese chunk boundaries are resolved by the coterminous words in English chunks. With the chunk-aligned bilingual corpus, a translation relation probability is proposed to align words. Finally, we evaluate our system on a real corpus and present the experimental results.


international conference on computational linguistics | 2002

Constructing of a large-scale Chinese-English parallel corpus

Le Sun; Song Xue; Weimin Qu; Xiaofeng Wang; Yufang Sun

This paper describes the construction of a large-scale (above 500,000 sentence pairs) Chinese-English parallel corpus. The current status of Chinese corpora is reviewed, with an emphasis on parallel corpora. The XML coding principles for the Chinese-English parallel corpus are discussed. The sentence alignment algorithm used in this project is described, together with a computer-aided checking process. Finally, we show the design of the concordance of the parallel corpus and the prospects for further development.


empirical methods in natural language processing | 2000

Query Translation in Chinese-English Cross-Language Information Retrieval

Yibo Zhang; Le Sun; Lin Du; Yufang Sun

This paper proposes a new query translation method based on the mutual information matrices of terms in the Chinese and English corpora. Instead of looking up a bilingual phrase dictionary, a compositional phrase in the query (one whose translation can be derived from the translations of its components) can be translated indirectly via a general-purpose Chinese-English dictionary look-up procedure. A novel selection method for translations of query terms is also presented in detail. Our query translation method ultimately constructs an English query in which each query term has a weight. The evaluation results show that the retrieval performance achieved by our query translation method is about 73% of that of monolingual information retrieval and about 28% higher than that of simple word-by-word translation.


systems, man and cybernetics | 2003

A framework for domain-specific search engine: design pattern perspective

Junlin Zhang; Weimin Qu; Lin Du; Yufang Sun

Domain-specific search engines have many advantages over their generic counterparts, such as high precision and a short update period. However, constructing these applications is time-consuming. We introduce an object-oriented framework for domain-specific search engine applications and describe the design patterns that contribute to this object-oriented architecture, revealing the framework's structure and the forces that shaped it. Using this framework, we fix a basic architecture and thus increase the ability to construct domain-specific search engine applications.


systems, man and cybernetics | 2001

PECAT: a computer-aided translation tool based on bilingual corpora

Le Sun; Yibo Zhang; Junlin Zhang; Yufang Sun

With the widespread use of computers in translation work and daily life, more and more bilingual corpora are becoming available. In this paper, the PECAT (Pilot English-Chinese Computer-Aided Translation) system, based on bilingual corpora, is described. The system has three main modules: a corpus-processing module, a sentence-matching module and a post-editing module. In order to increase the coverage of input source sentences, the text alignment in the corpus-processing module is performed at the chunk level. The algorithm for matching input sentences against source sentences is a three-layer edit-distance algorithm guided by the user, which includes information about word morphology and part-of-speech (POS). Preliminary experiments show promising results.
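The abstract does not spell out the three-layer edit-distance algorithm, so the following is only a rough sketch of the general idea: a word-level edit distance whose substitution cost drops when two tokens share a surface form, a lemma, or a POS tag. The token layout `(word, lemma, pos)`, the function names, and the cost values 0.25 and 0.5 are all assumptions made for illustration and are not taken from the PECAT paper; the user-guidance step is not modeled.

```python
def layered_cost(a, b):
    """Substitution cost between two (word, lemma, pos) tokens.

    The three layers and their weights are hypothetical.
    """
    if a[0] == b[0]:
        return 0.0   # identical surface form
    if a[1] == b[1]:
        return 0.25  # same lemma, different inflection
    if a[2] == b[2]:
        return 0.5   # same part of speech only
    return 1.0       # unrelated tokens


def edit_distance(src, tgt, sub_cost=layered_cost):
    """Word-level edit distance with unit insert/delete cost."""
    # prev[j] holds the distance between the processed prefix of src
    # and the first j tokens of tgt.
    prev = [float(j) for j in range(len(tgt) + 1)]
    for i, a in enumerate(src, start=1):
        curr = [float(i)]
        for j, b in enumerate(tgt, start=1):
            curr.append(min(
                prev[j] + 1.0,                 # delete a from src
                curr[j - 1] + 1.0,             # insert b into src
                prev[j - 1] + sub_cost(a, b),  # substitute a with b
            ))
        prev = curr
    return prev[-1]


# Hypothetical example tokens: (word, lemma, POS tag).
query = [("he", "he", "PRP"), ("buys", "buy", "VBZ"), ("books", "book", "NNS")]
stored = [("he", "he", "PRP"), ("bought", "buy", "VBD"),
          ("a", "a", "DT"), ("book", "book", "NN")]
distance = edit_distance(query, stored)
```

In a translation-memory setting, a stored source sentence with a smaller distance to the input would be ranked as a closer match; how PECAT actually combines the layers may differ from this sketch.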


Archive | 1999

Sentence Alignment of English-Chinese Complex Bilingual Corpora

Le Sun; Lin Du; Yufang Sun; Jin Youbin


text retrieval conference | 2004

ISCAS at TREC-2004: HARD Track

Le Sun; Junlin Zhang; Yufang Sun


NTCIR | 2002

ISCAS at NTCIR-3: Monolingual, Bilingual and MultiLingual IR Tasks

Junlin Zhang; Le Sun; Weimin Qu; Lin Du; Yufang Sun; Yangxing Fan; Zhigen Lin


NTCIR | 2001

ISCAS: Text Retrieval in NTCIR Workshop II

Yibo Zhang; Le Sun; Lin Du; Youbing Jin; Yufang Sun


language resources and evaluation | 2000

Automatic Extraction of English-Chinese Term Lexicons from Noisy Bilingual Corpora

Le Sun; Youbing Jin; Lin Du; Yufang Sun

Collaboration


Dive into Yufang Sun's collaborations.

Top Co-Authors

Le Sun

Chinese Academy of Sciences

Lin Du

Chinese Academy of Sciences

Junlin Zhang

Chinese Academy of Sciences

Weimin Qu

Chinese Academy of Sciences

Yibo Zhang

Chinese Academy of Sciences

Youbing Jin

Chinese Academy of Sciences
