Publication


Featured research published by Feifei Zhai.


Archive | 2012

Handling Unknown Words in Statistical Machine Translation from a New Perspective

Jiajun Zhang; Feifei Zhai; Chengqing Zong

Unknown words are one of the key factors that drastically impact translation quality. Traditionally, nearly all related research has focused on obtaining translations of the unknown words in different ways. In this paper, we propose a new perspective on handling unknown words in statistical machine translation. Instead of making great efforts to find the translation of unknown words, this paper focuses on determining the semantic function that the unknown words serve in the test sentence and keeping that semantic function unchanged in the translation process. In this way, unknown words help the phrase reordering and lexical selection of their surrounding words even though they remain untranslated. To determine the semantic function of each unknown word, this paper employs a distributional semantic model and a bidirectional language model. Extensive experiments on Chinese-to-English translation show that our methods can substantially improve translation quality.


IEEE Transactions on Audio, Speech, and Language Processing | 2013

Syntax-Based Translation With Bilingually Lexicalized Synchronous Tree Substitution Grammars

Jiajun Zhang; Feifei Zhai; Chengqing Zong

Syntax-based models can significantly improve translation performance due to their grammatical modeling on one or both language sides. However, translation rules such as the non-lexical rule “VP → (x0 x1, VP:x1 PP:x0)” in string-to-tree models do not consider any lexicalized information on the source or target side. The rule is so general that any subtree rooted at VP can substitute for the nonterminal VP:x1. Because rules containing nonterminals are frequently used when generating the target-side tree structures, there is a risk that rules of this type will be severely misused in decoding due to a lack of lexicalization guidance. In this article, inspired by lexicalized PCFG, which is widely used in monolingual parsing, we propose to upgrade the STSG (synchronous tree substitution grammar)-based syntax translation model with bilingually lexicalized STSG. Using the string-to-tree translation model as a case study, we present generative and discriminative models to integrate lexicalized STSG into the translation model. Both small- and large-scale experiments on Chinese-to-English translation demonstrate that the proposed lexicalized STSG can provide superior rule selection in decoding and substantially improve translation quality.


Meeting of the Association for Computational Linguistics | 2014

RNN-based Derivation Structure Prediction for SMT

Feifei Zhai; Jiajun Zhang; Yu Zhou; Chengqing Zong

In this paper, we propose a novel derivation structure prediction (DSP) model for SMT using a recursive neural network (RNN). The model involves two steps: (1) phrase-pair vector representation, which learns vector representations for phrase pairs; and (2) derivation structure prediction, which builds a bilingual RNN that aims to distinguish good derivation structures from bad ones. Experimental results show that our DSP model can significantly improve translation quality.


Journal of Computer Science and Technology | 2013

A Substitution-Translation-Restoration Framework for Handling Unknown Words in Statistical Machine Translation

Jiajun Zhang; Feifei Zhai; Chengqing Zong

Unknown words are one of the key factors that greatly affect translation quality. Traditionally, nearly all related research has focused on obtaining the translation of the unknown words. However, these approaches have two disadvantages. On the one hand, they usually rely on many additional resources such as bilingual web data; on the other hand, they cannot guarantee good reordering and lexical selection of surrounding words. This paper gives a new perspective on handling unknown words in statistical machine translation (SMT). Instead of making great efforts to find the translation of unknown words, we focus on determining the semantic function of the unknown word in the test sentence and keeping that semantic function unchanged in the translation process. In this way, unknown words can help the phrase reordering and lexical selection of their surrounding words even though they remain untranslated. To determine the semantic function of an unknown word, we employ a distributional semantic model and a bidirectional language model. Extensive experiments on both phrase-based and linguistically syntax-based SMT models in Chinese-to-English translation show that our method can substantially improve translation quality.


Empirical Methods in Natural Language Processing | 2015

Search-Aware Tuning for Hierarchical Phrase-based Decoding

Feifei Zhai; Liang Huang; Kai Zhao

Parameter tuning is a key problem for statistical machine translation (SMT). Most popular parameter tuning algorithms for SMT are agnostic of decoding, resulting in parameters vulnerable to search errors in decoding. Recent work on “search-aware tuning” (Liu and Huang, 2014) addresses this problem by considering the partial derivations in every decoding step so that the promising ones are more likely to survive the inexact decoding beam. We extend this approach from phrase-based translation to syntax-based translation by generalizing the evaluation metrics for partial translations to handle tree-structured derivations in a way inspired by the inside-outside algorithm. Our approach is simple to use and can be applied to most conventional parameter tuning methods as a plugin. Extensive experiments on Chinese-to-English translation show significant BLEU improvements on MERT, MIRA and PRO.


National Conference on Artificial Intelligence | 2016

SummaRuNNer: A Recurrent Neural Network Based Sequence Model for Extractive Summarization of Documents

Ramesh Nallapati; Feifei Zhai; Bowen Zhou


Empirical Methods in Natural Language Processing | 2011

Augmenting String-to-Tree Translation Models with Fuzzy Use of Source-side Syntax

Jiajun Zhang; Feifei Zhai; Chengqing Zong


National Conference on Artificial Intelligence | 2017

Neural Models for Sequence Chunking

Feifei Zhai; Saloni Potdar; Bing Xiang; Bowen Zhou


International Conference on Computational Linguistics | 2012

Machine Translation by Modeling Predicate-Argument Structure Transformation

Feifei Zhai; Jiajun Zhang; Yu Zhou; Chengqing Zong


International Conference on Computational Linguistics | 2012

Tree-based Translation without Using Parse Trees

Feifei Zhai; Jiajun Zhang; Yu Zhou; Chengqing Zong

Collaboration


Dive into Feifei Zhai's collaborations.

Top Co-Authors

Chengqing Zong

Chinese Academy of Sciences

Jiajun Zhang

Chinese Academy of Sciences

Yu Zhou

Chinese Academy of Sciences

Peng Liu

Chinese Academy of Sciences

Yining Wang

Chinese Academy of Sciences

Kai Zhao

City University of New York

Liang Huang

City University of New York
