Yufang Sun
Chinese Academy of Sciences
Publication
Featured research published by Yufang Sun.
empirical methods in natural language processing | 2000
Le Sun; Youbing Jin; Lin Du; Yufang Sun
This paper proposes a chunk-based method for word alignment of an English-Chinese corpus. The chunks of the English sentences are identified first. The chunk boundaries of the Chinese sentences are then predicted from the translations of the English chunks and heuristic information. Ambiguities in the Chinese chunk boundaries are resolved using the coterminous words in the English chunks. Given the chunk-aligned bilingual corpus, a translation relation probability is proposed to align words. Finally, the system is evaluated on a real corpus and the experimental results are presented.
international conference on computational linguistics | 2002
Le Sun; Song Xue; Weimin Qu; Xiaofeng Wang; Yufang Sun
This paper describes the construction of a large-scale Chinese-English parallel corpus (more than 500,000 sentence pairs). The current status of Chinese corpora is reviewed, with an emphasis on parallel corpora. The XML coding principles for the Chinese-English parallel corpus are discussed. The sentence alignment algorithm used in this project is described, together with a computer-aided checking process. Finally, we present the design of the concordance for the parallel corpus and the prospects for further development.
empirical methods in natural language processing | 2000
Yibo Zhang; Le Sun; Lin Du; Yufang Sun
This paper proposes a new query translation method based on the mutual information matrices of terms in Chinese and English corpora. Instead of looking up a bilingual phrase dictionary, a compositional phrase in the query (one whose translation can be derived from the translations of its components) is translated indirectly via a general-purpose Chinese-English dictionary look-up procedure. A novel method for selecting translations of query terms is also presented in detail. The method ultimately constructs an English query in which each query term has a weight. The evaluation results show that the retrieval performance of this method is about 73% of that of monolingual information retrieval and about 28% higher than that of simple word-by-word translation.
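As a rough illustration of the mutual-information idea mentioned in this abstract (not the authors' implementation), the sketch below computes pointwise mutual information between term pairs from document-level co-occurrence counts. The function name `pmi_matrix` and the toy corpus are assumptions made for the example. The abstract does not say how the matrices were actually built.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_matrix(docs):
    """Return {(a, b): PMI} for term pairs that co-occur in at least one document.

    Hypothetical helper: probabilities are estimated from document frequencies,
    which is one common choice and not necessarily the one used in the paper.
    """
    n = len(docs)
    term_df = Counter()  # number of documents containing each term
    pair_df = Counter()  # number of documents containing each term pair
    for doc in docs:
        terms = sorted(set(doc))
        term_df.update(terms)
        pair_df.update(combinations(terms, 2))
    return {
        (a, b): math.log2((c / n) / ((term_df[a] / n) * (term_df[b] / n)))
        for (a, b), c in pair_df.items()
    }


# Toy corpus (assumed data): each document is a list of terms.
docs = [["bank", "money"], ["bank", "money"], ["bank", "river"], ["tree", "river"]]
scores = pmi_matrix(docs)
```

Pairs that co-occur more often than chance predicts get a positive score, so such scores could in principle be used to prefer translation candidates that are mutually coherent.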
systems, man and cybernetics | 2003
Junlin Zhang; Weimin Qu; Lin Du; Yufang Sun
A domain-specific search engine has many advantages over its generic counterparts, such as high precision and a short update period. However, constructing such applications is time-consuming. We introduce an object-oriented framework for domain-specific search engine applications and describe some of the design patterns that contribute to its object-oriented architecture, revealing the framework's structure and the forces that shaped it. The framework fixes a basic architecture and thus makes it easier to construct domain-specific search engine applications.
systems, man and cybernetics | 2001
Le Sun; Yibo Zhang; Junlin Zhang; Yufang Sun
With the widespread use of computers in translation work and daily life, more and more bilingual corpora are becoming available. This paper describes the PECAT (Pilot English-Chinese Computer-Aided Translation) system, which is based on bilingual corpora. The system has three main modules: a corpus-processing module, a sentence-matching module and a post-editing module. To increase the coverage of input source sentences, text alignment in the corpus-processing module is performed at the chunk level. Input sentences are matched against source sentences by a user-guided three-layer edit-distance algorithm that incorporates information about word morphology and part-of-speech (POS). Preliminary experiments show promising results.
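The abstract gives no details of the three-layer edit-distance algorithm, so the following is only a minimal sketch of one way word form, morphology and POS could be combined in a token-level edit distance. The token layout `(surface, lemma, pos)`, the function names and the substitution weights are all assumptions for illustration, not the PECAT design.

```python
def sub_cost(a, b):
    """Substitution cost between two (surface, lemma, pos) tokens.

    The weights 0.25 and 0.5 are illustrative assumptions.
    """
    if a[0] == b[0]:  # identical word form
        return 0.0
    if a[1] == b[1]:  # same lemma, different inflection
        return 0.25
    if a[2] == b[2]:  # different word, same part of speech
        return 0.5
    return 1.0


def layered_edit_distance(src, tgt):
    """Token-level Levenshtein distance using the layered substitution cost."""
    prev = [float(j) for j in range(len(tgt) + 1)]
    for i, a in enumerate(src, 1):
        cur = [float(i)]
        for j, b in enumerate(tgt, 1):
            cur.append(min(
                prev[j] + 1.0,                # delete a from src
                cur[j - 1] + 1.0,             # insert b into src
                prev[j - 1] + sub_cost(a, b)  # substitute a with b
            ))
        prev = cur
    return prev[-1]


# Toy sentences (assumed data) as lists of (surface, lemma, pos) tokens.
s1 = [("he", "he", "PRP"), ("runs", "run", "VBZ")]
s2 = [("he", "he", "PRP"), ("ran", "run", "VBD")]
s3 = [("he", "he", "PRP"), ("eats", "eat", "VBZ")]
```

Under these assumed weights, a stored sentence that differs from the input only in inflection scores as closer than one that differs in the word itself, which is the kind of graded match a translation-memory lookup would want.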
Archive | 1999
Le Sun; Lin Du; Yufang Sun; Jin Youbin
text retrieval conference | 2004
Le Sun; Junlin Zhang; Yufang Sun
NTCIR | 2002
Junlin Zhang; Le Sun; Weimin Qu; Lin Du; Yufang Sun; Yangxing Fan; Zhigen Lin
NTCIR | 2001
Yibo Zhang; Le Sun; Lin Du; Youbing Jin; Yufang Sun
language resources and evaluation | 2000
Le Sun; Youbing Jin; Lin Du; Yufang Sun