Publication


Featured research published by Dongfeng Cai.


international conference natural language processing | 2011

A collocation based approach for prepositional phrase identification

Dongfeng Cai; Ling Zhang; Qiaoli Zhou; Yue Zhao

A prepositional phrase (PP) consists of two parts: a preposition as the leading part and a word or phrase as the tail part. Accordingly, this paper proposes a new approach for identifying PPs. In this method, PP identification is transformed into identifying the collocation between the preposition itself and the right boundary word. Cascaded Conditional Random Fields (CCRFs) are used in this approach. With the Penn Chinese Treebank 5.1 as the experimental corpus, the F1 rises to 94.63%. This approach represents a breakthrough in this specific field, as its F1 is about 8.6% higher than that of any publicly published paper.


international conference natural language processing | 2009

Chinese maximal noun phrase parsing based on cascaded conditional random fields

Dongfeng Cai; Xin Liu; Qiaoli Zhou; Na Ye

This paper proposes an approach for Chinese Maximal Noun Phrase parsing based on Cascaded Conditional Random Fields. In this approach, the parse tree of a Chinese Maximal Noun Phrase is constructed layer by layer. The Chinese chunks are first recognized by the lower Conditional Random Fields model; the result is then passed as input to the higher model for phrase recognition, and phrase recognition continues until no new phrases are discovered. Post-processing rules are constructed between the lower and higher models to correct erroneously recognized Chinese chunks, and finally the phrase structure tree of the Chinese Maximal Noun Phrase is constructed. In the open test, our Chinese Maximal Noun Phrase parser achieves an F1-score of 92.02%.
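The layer-by-layer control flow described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the two CRF models are replaced by hypothetical pattern-matching stand-ins (`toy_chunker`, `toy_phrase_model`), the tags `JJ`, `NN`, and `DEG` and the token format `word/TAG` are chosen for the example only, and the post-processing rules between the two models are omitted.

```python
from typing import Callable, List, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, label)
Recognizer = Callable[[List[str]], List[Span]]


def category(unit: str) -> str:
    """Label of a bracketed unit, or the tag of a 'word/TAG' token."""
    if unit.startswith("["):
        return unit[1:].split(" ", 1)[0]
    return unit.rsplit("/", 1)[1]


def match_pattern(units: List[str], pattern: List[str], label: str) -> List[Span]:
    """Hypothetical stand-in for a trained model: left-to-right, non-overlapping matches."""
    spans, i, n = [], 0, len(pattern)
    while i + n <= len(units):
        if [category(u) for u in units[i:i + n]] == pattern:
            spans.append((i, i + n, label))
            i += n
        else:
            i += 1
    return spans


def merge_spans(units: List[str], spans: List[Span]) -> List[str]:
    """Collapse each recognized span into one bracketed unit."""
    by_start = {start: (end, label) for start, end, label in spans}
    out, i = [], 0
    while i < len(units):
        if i in by_start:
            end, label = by_start[i]
            out.append("[" + label + " " + " ".join(units[i:end]) + "]")
            i = end
        else:
            out.append(units[i])
            i += 1
    return out


def cascade_parse(tokens: List[str], chunker: Recognizer,
                  phrase_model: Recognizer, max_layers: int = 20) -> List[str]:
    """Lower model finds chunks; higher model is reapplied until nothing new is found."""
    layer = merge_spans(tokens, chunker(tokens))
    for _ in range(max_layers):
        spans = phrase_model(layer)
        if not spans:  # no new phrases discovered: stop
            break
        layer = merge_spans(layer, spans)
    return layer


def toy_chunker(units: List[str]) -> List[Span]:
    return match_pattern(units, ["JJ", "NN"], "NP")


def toy_phrase_model(units: List[str]) -> List[Span]:
    return match_pattern(units, ["NP", "DEG", "NP"], "NP")


tokens = ["A/JJ", "B/NN", "C/DEG", "D/JJ", "E/NN", "F/DEG", "G/JJ", "H/NN"]
tree = cascade_parse(tokens, toy_chunker, toy_phrase_model)
print(tree[0])
```

In this toy run the higher-level recognizer fires twice before the loop stops, which mirrors the "continue until no new phrases are discovered" condition; a real system would plug trained sequence labelers into the two `Recognizer` slots.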


China Workshop on Machine Translation | 2016

Automatic Construction of Domain Terminology Knowledge Base for HowNet Based on the Headword

Chuang Wu; Lin Wang; Na Ye; Guiping Zhang; Dongfeng Cai

HowNet is a Chinese-English bilingual common-sense knowledge base that plays an important role in machine translation tasks. However, for domain-specific machine translation tasks, HowNet must be supplemented with domain-specific terminology. In other words, we need to construct a domain terminology semantic knowledge base. In this paper, we propose a method to automatically construct a domain terminology knowledge base, based on the headword of each term. Specifically, the semantic meaning (HowNet DEF) of an unseen term is defined as one of the semantic meanings of the term's headword. Headword disambiguation is done by considering the context of headwords and adding domain-specific disambiguation rules to the general disambiguation rules. Experiments on the aviation domain show that our proposed headword disambiguation method achieves a 9.4% improvement over the default disambiguation tools in HowNet. We also find that, with our automatically constructed domain terminology knowledge base, the HowNet machine translation system achieves better translation quality.


workshop on chinese lexical semantics | 2015

Incorporating Prepositional Phrase Classification Knowledge in Prepositional Phrase Identification

Qiaoli Zhou; Ling Zhang; Na Ye; Dongfeng Cai

This paper proposes a method of prepositional phrase (PP) identification that incorporates PP classification knowledge. When PPs act as different syntactic constituents, they have different characteristics in terms of location and context. In this paper, PPs are classified based on the context in which they appear. We select features based on the category of PPs to train multiple machine learning models for PP identification, and then recombine their identification results. In this way, we can make full use of the complementary advantages of the multiple models.


international conference on asian language processing | 2015

Improving interactive machine translation via multiple positive constraints

Na Ye; Yitao Fu; Guiping Zhang; Chuang Wu; Dongfeng Cai

Since machine translation systems are still unable to produce satisfactory output, various interactive machine translation (IMT) approaches have recently been proposed. State-of-the-art IMT systems use the human-validated prefix as the only constraint that guides decoding, so the human guidance is quite insufficient. This paper extends the human-computer interaction by allowing translators to provide multiple correct fragments (CFs) in addition to the prefix. These fragments act as positive constraints for decoding. Four improvements are proposed to adapt traditional IMT to this new interaction mode. Experimental results on Chinese-English corpora show that our method achieves lower KSMT scores than the traditional IMT method at comparable prediction speed.


international conference natural language processing | 2011

Study on assistant concept acquisition in domain ontology construction for Chinese texts

Guiping Zhang; Xiaoying Zhang; Peiyan Wang; Dongfeng Cai

Concept acquisition is an important part of domain ontology construction, and how to accomplish assisted concept acquisition has become a research focus. In this paper, a character-based CRF model is adopted to obtain the set of candidate terms. We propose an active learning algorithm that selects a concept from the set of candidate terms for the user, and we use the stochastic gradient descent algorithm to train the weights of concepts. The experimental results show that this algorithm can effectively assist users in acquiring domain concepts. When the set of correct terms identified by the CRF model is used as candidate concepts, the MAP reaches 0.9335.
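For readers unfamiliar with the metric, MAP (mean average precision) can be computed as in the sketch below. This uses the common definition in which average precision is normalized by the number of relevant items; whether the paper uses exactly this variant is an assumption, and the function names and sample data are illustrative only.

```python
from typing import Iterable, List, Sequence, Set, Tuple


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Mean of precision@k taken at each rank k where a relevant item appears."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    # Normalizing by len(relevant) penalizes relevant items that never appear.
    return total / len(relevant)


def mean_average_precision(runs: Iterable[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average of average_precision over several ranked lists."""
    scores: List[float] = [average_precision(r, rel) for r, rel in runs]
    return sum(scores) / len(scores) if scores else 0.0


# Illustrative data: two ranked candidate lists with their correct concepts.
runs = [
    (["a", "x", "b"], {"a", "b"}),  # AP = (1/1 + 2/3) / 2
    (["x", "c"], {"c"}),            # AP = 1/2
]
map_score = mean_average_precision(runs)
print(round(map_score, 4))
```

A MAP close to 1 means correct concepts are consistently ranked near the top of the candidate list.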


international conference natural language processing | 2010

A new cascade algorithm based on CRFs for recognizing Chinese verb-object collocation

Guiping Zhang; Zhichao Liu; Qiaoli Zhou; Dongfeng Cai; Jiao Cheng

This paper proposes a new cascade algorithm based on conditional random fields. The algorithm is applied to the automatic recognition of Chinese verb-object collocations and is combined with a new sequence labeling scheme, “ONIY”. Experiments compare the identification results under two segmentation and part-of-speech tag sets. The comprehensive experimental results show that the best performance is an F-score of 90.65% on the Tsinghua Treebank, and an F-score of 82.00% under the segmentation and part-of-speech tagging scheme of Peking University. Our experiments show that the proposed algorithm can greatly improve the recognition accuracy of multi-nested collocations and plays a positive role in recognizing long-distance collocations.


international conference natural language processing | 2009

A weakly supervised optimize method in latent semantic indexing

Duo Ji; Dongbo Guo; Dongfeng Cai; Yu Bai

Latent Semantic Indexing (LSI) is an effective feature extraction method that has been applied to many text learning tasks, such as text clustering and information retrieval. This paper thoroughly analyses the influence of term co-occurrences on the LSI mapping and puts forward a method named pseudo document, which strengthens the beneficial term co-occurrences by adding heuristic knowledge to the text collection so as to make the LSI mapping more reasonable. The experimental results show that the pseudo document method can effectively improve the performance of patent retrieval.


Archive | 2008

Large scale text data external clustering method and system

Duo Ji; Dongfeng Cai; Guiping Zhang; Baosheng Yin; Xuelei Miao; Qiaoli Zhou; Yu Bai


joint conference on lexical and computational semantics | 2012

A divide-and-conquer strategy for semantic dependency parsing

Qiaoli Zhou; Ling Zhang; Fei Liu; Dongfeng Cai; Guiping Zhang

Collaboration


Dive into Dongfeng Cai's collaboration.

Top Co-Authors

Guiping Zhang

Shenyang Aerospace University

Qiaoli Zhou

Shenyang Aerospace University

Na Ye

Shenyang Aerospace University

Ling Zhang

Shenyang Aerospace University

Chuang Wu

Shenyang Aerospace University

Duo Ji

Shenyang Aerospace University

Dongbo Guo

Shenyang Aerospace University

Fei Liu

Shenyang Aerospace University

Jiao Cheng

Shenyang Aerospace University
