Publication


Featured research published by Yajuan Lü.


international joint conference on natural language processing | 2009

Improving Tree-to-Tree Translation with Packed Forests

Yang Liu; Yajuan Lü; Qun Liu

Current tree-to-tree models suffer from parsing errors as they usually use only 1-best parses for rule extraction and decoding. We instead propose a forest-based tree-to-tree model that uses packed forests. The model is based on a probabilistic synchronous tree substitution grammar (STSG), which can be learned from aligned forest pairs automatically. The decoder finds ways of decomposing trees in the source forest into elementary trees using the source projection of the STSG while building a target forest in parallel. The resulting system is comparable to the state-of-the-art phrase-based system Moses, and using packed forests in tree-to-tree translation yields a significant absolute improvement of 3.6 BLEU points over using 1-best trees.


Proceedings of the Workshop on Multiword Expressions: Identification, Interpretation, Disambiguation and Applications | 2009

Improving Statistical Machine Translation Using Domain Bilingual Multiword Expressions

Zhixiang Ren; Yajuan Lü; Jie Cao; Qun Liu; Yun Huang

Multiword expressions (MWEs) have proven useful for many natural language processing tasks. However, how to use them to improve the performance of statistical machine translation (SMT) is not well studied. This paper presents a simple yet effective strategy to extract domain bilingual multiword expressions. In addition, we implement three methods to integrate bilingual MWEs into Moses, the state-of-the-art phrase-based machine translation system. Experiments show that bilingual MWEs can improve translation performance significantly.


meeting of the association for computational linguistics | 2004

Collocation Translation Acquisition Using Monolingual Corpora

Yajuan Lü; Ming Zhou

Collocation translation is important for machine translation and many other NLP tasks. Unlike previous methods using bilingual parallel corpora, this paper presents a new method for acquiring collocation translations by making use of monolingual corpora and linguistic knowledge. First, dependency triples are extracted from Chinese and English corpora with dependency parsers. Then, a dependency triple translation model is estimated using the EM algorithm based on a dependency correspondence assumption. The generated triple translation model is used to extract collocation translations from two monolingual corpora. Experiments show that our approach outperforms the existing monolingual corpus based methods in dependency triple translation and achieves promising results in collocation translation extraction.


international conference on computational linguistics | 2002

Learning Chinese bracketing knowledge based on a bilingual language model

Yajuan Lü; Sheng Li; Tiejun Zhao; Muyun Yang

This paper proposes a new method for automatic acquisition of Chinese bracketing knowledge from English-Chinese sentence-aligned bilingual corpora. Bilingual sentence pairs are first aligned in syntactic structure by combining English parse trees with a statistical bilingual language model. Chinese bracketing knowledge is then extracted automatically. Preliminary experiments show that the automatically learned knowledge accords well with manually annotated brackets. The proposed method is particularly useful for acquiring bracketing knowledge for a less-studied language that lacks the tools and resources available for a better-studied one. Although this paper discusses experiments with Chinese and English, the method is also applicable to other language pairs.


meeting of the association for computational linguistics | 2009

Reducing SMT Rule Table with Monolingual Key Phrase

Zhongjun He; Yao Meng; Yajuan Lü; Hao Yu; Qun Liu

This paper presents an effective approach to discard most entries of the rule table for statistical machine translation. The rule table is filtered by monolingual key phrases, which are extracted from source text using a technique based on term extraction. Experiments show that the rule table can be reduced by 78% without worsening translation performance. In most cases, our approach results in measurable improvements in BLEU score.
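The filtering idea in this abstract can be illustrated with a minimal sketch. It assumes the rule table is a mapping from source phrases to candidate translations and that the key phrases have already been extracted; the function names `filter_rule_table` and `reduction_ratio` and the toy data are hypothetical and are not taken from the paper.

```python
# Minimal sketch: keep only rule-table entries whose source side is a key phrase.
# The data layout and names are assumptions for illustration, not the paper's code.

def filter_rule_table(rule_table, key_phrases):
    """Return the entries of rule_table whose source phrase is in key_phrases."""
    keep = set(key_phrases)
    return {src: tgts for src, tgts in rule_table.items() if src in keep}


def reduction_ratio(original, filtered):
    """Fraction of entries discarded by the filter."""
    return 1.0 - len(filtered) / len(original)


# Toy rule table (hypothetical entries).
toy_table = {
    "machine translation": ["traduction automatique"],
    "of the": ["de la", "du"],
    "rule table": ["table de règles"],
    "in a": ["dans un"],
}
toy_key_phrases = ["machine translation", "rule table"]

filtered_table = filter_rule_table(toy_table, toy_key_phrases)
```

On this toy table, two of the four entries survive the filter. How the key phrases are extracted is the substance of the paper and is not modeled here.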


Computational Linguistics | 2015

Automatic adaptation of annotations

Wenbin Jiang; Yajuan Lü; Liang Huang; Qun Liu

Manually annotated corpora are indispensable resources, yet for many annotation tasks, such as the creation of treebanks, there exist multiple corpora with different and incompatible annotation guidelines. This leads to an inefficient use of human expertise, but it could be remedied by integrating knowledge across corpora with different annotation guidelines. In this article we describe the problem of annotation adaptation and the intrinsic principles of the solutions, and present a series of successively enhanced models that can automatically adapt the divergence between different annotation formats. We evaluate our algorithms on the tasks of Chinese word segmentation and dependency parsing. For word segmentation, where there are no universal segmentation guidelines because of the lack of morphology in Chinese, we perform annotation adaptation from the much larger People's Daily corpus to the smaller but more popular Penn Chinese Treebank. For dependency parsing, we perform annotation adaptation from the Penn Chinese Treebank to a semantics-oriented Dependency Treebank, which is annotated using significantly different annotation guidelines. In both experiments, automatic annotation adaptation brings significant improvement, achieving state-of-the-art performance despite the use of purely local features in training.


Chinese Physics | 2006

Induced growth of high quality ZnO thin films by crystallized amorphous ZnO

Zeheng Wang; L J Song; Shouchun Li; Yajuan Lü; Yunxia Tian; Jing-yao Liu; Lianyuan Wang

This paper reports the induced growth of high-quality ZnO thin films by crystallized amorphous ZnO. Amorphous ZnO was first prepared by a solid-state pyrolytic reaction; then, taking crystallized amorphous ZnO as seeds (buffer layer), ZnO thin films were grown in a diethylene glycol solution of zinc acetate at 80°C. The X-ray diffraction curve indicates that the films were preferentially oriented along the [001] out-of-plane direction of ZnO. Atomic force microscopy and scanning electron microscopy were used to evaluate the surface morphology of the ZnO thin films. The photoluminescence spectrum exhibits strong ultraviolet emission, while the visible emission is very weak. The results indicate that high-quality ZnO thin films were obtained.


international universal communication symposium | 2010

A fixed-point decoding approach for statistical machine translation on mobile terminals

Xiang Li; Jin’an Xu; Wenbin Jiang; Qun Liu; Yajuan Lü

The demand for statistical machine translation on mobile terminals is increasing rapidly, but translation speed is restricted by embedded processors that lack a floating-point unit. This paper proposes an approach that converts floating-point numbers into fixed-point numbers for SMT decoding on mobile terminals, in order to reduce the impact of such processors on translation speed. Experiments on a PC and a mobile terminal show that this approach preserves translation quality and that fixed-point arithmetic operations are 135.6% faster than floating-point arithmetic operations. Therefore, this approach can efficiently improve the translation speed of SMT systems on mobile terminals with weak floating-point capability.
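As a rough illustration of the float-to-fixed conversion this abstract describes, the sketch below scores a hypothesis using integer arithmetic only. The Q16.16 format, the function names, and the weighted-sum scoring are assumptions made for illustration; the abstract does not state which fixed-point format or scoring routine the authors used.

```python
# Minimal sketch of fixed-point scoring, assuming a Q16.16 format
# (16 fractional bits). Names and format are illustrative assumptions.

FRAC_BITS = 16
SCALE = 1 << FRAC_BITS


def to_fixed(x: float) -> int:
    """Convert a float to its nearest Q16.16 integer representation."""
    return int(round(x * SCALE))


def to_float(q: int) -> float:
    """Convert a Q16.16 integer back to a float (for inspection only)."""
    return q / SCALE


def fixed_mul(a: int, b: int) -> int:
    """Multiply two Q16.16 values; the shift rescales the product."""
    return (a * b) >> FRAC_BITS


def score_fixed(weights, features):
    """Weighted sum of feature values computed with integer operations."""
    total = 0
    for w, f in zip(weights, features):
        total += fixed_mul(to_fixed(w), to_fixed(f))
    return total
```

In a real decoder the weights and model scores would be converted once, offline, so that no floating-point operation remains in the inner loop. The conversion happens inside `score_fixed` here only to keep the sketch short.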


international conference on asian language processing | 2013

Rule Refinement for Spoken Language Translation by Retrieving the Missing Translation of Content Words

Linfeng Song; Jun Xie; Xing Wang; Yajuan Lü; Qun Liu

Spoken language translation usually suffers from missing translations of content words, which prevents it from generating an appropriate translation. In this paper we propose a novel Mutual Information based method to improve spoken language translation by retrieving the missing translations of content words. For each rule, we exploit several features that indicate how well its inner content words are translated, so that MT systems can select better translation rules. Experimental results show that our method improves translation performance significantly, by 1.95 to 4.47 BLEU points on different test sets.


international universal communication symposium | 2010

Multilingual Machine Translation system

Yajuan Lü; Yang Liu; Qun Liu

We present a Multilingual Machine Translation system (http://mtgroup.ict.ac.cn/transRoom.php), which can translate text between Chinese and English, Korean, Uyghur, Tibetan, Mongolian, and other languages. The system is developed based on the Maximum Entropy based Bracketing Transduction Grammar (ME-BTG) model, which is a kind of formal syntax-based statistical translation model. All the translation engines are trained using sentence-aligned bilingual corpora.

Collaboration


Top Co-Authors

Qun Liu | Dublin City University
Wenbin Jiang | Chinese Academy of Sciences
Haitao Mi | Chinese Academy of Sciences
Hao Xiong | Chinese Academy of Sciences
Jun Xie | Chinese Academy of Sciences
Fandong Meng | Chinese Academy of Sciences
J.C. Yan | Harbin Institute of Technology
Jin Huang | Chinese Academy of Sciences
L. Huang | Harbin Institute of Technology