Linfeng Song
Chinese Academy of Sciences
Publication
Featured research published by Linfeng Song.
conference on computational natural language learning | 2015
Xiaochang Peng; Linfeng Song; Daniel Gildea
This paper presents a synchronous-graph-grammar-based approach for string-to-AMR parsing. We apply Markov Chain Monte Carlo (MCMC) algorithms to learn Synchronous Hyperedge Replacement Grammar (SHRG) rules from a forest that represents likely derivations consistent with a fixed string-to-graph alignment. Drawing an analogy between string-to-AMR parsing and phrase-based machine translation, we derive an efficient algorithm for learning graph grammars from string–graph pairs. We propose an effective approximation strategy to resolve the complexity issue of graph compositions. We also present useful strategies for overcoming existing problems in an SHRG-based parser, along with preliminary results of the graph-grammar-based approach.
empirical methods in natural language processing | 2016
Linfeng Song; Yue Zhang; Xiaochang Peng; Zhiguo Wang; Daniel Gildea
The task of AMR-to-text generation is to generate grammatical text that preserves the semantic meaning of a given AMR graph. We attack the task by first partitioning the AMR graph into smaller fragments, then generating a translation for each fragment, and finally deciding the order by solving an asymmetric generalized traveling salesman problem (AGTSP). A Maximum Entropy classifier is trained to estimate the traveling costs, and a TSP solver is used to find an optimized solution. The final model reports a BLEU score of 22.44 on the SemEval-2016 Task 8 dataset.
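The fragment-ordering step above can be illustrated with a toy sketch. Here the asymmetric cost matrix is hand-set for illustration (in the paper it comes from a trained Maximum Entropy classifier), and a brute-force search stands in for a real TSP solver, which is only feasible for a handful of fragments:

```python
from itertools import permutations

def order_fragments(fragments, cost):
    """Find the lowest-cost ordering of translated fragments.

    cost[i][j] is the (asymmetric) cost of placing fragment j
    directly after fragment i.  Brute force over permutations is a
    toy stand-in for the TSP solver used in the paper.
    """
    n = len(fragments)
    best_order, best_cost = None, float("inf")
    for perm in permutations(range(n)):
        c = sum(cost[perm[k]][perm[k + 1]] for k in range(n - 1))
        if c < best_cost:
            best_cost, best_order = c, perm
    return [fragments[i] for i in best_order]

# toy example with hand-set costs (not from a trained classifier)
fragments = ["the boy", "wants", "to go"]
cost = [[0, 1, 5],
        [4, 0, 1],
        [2, 6, 0]]
print(" ".join(order_fragments(fragments, cost)))  # prints "the boy wants to go"
```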
empirical methods in natural language processing | 2014
Yue Zhang; Kai Song; Linfeng Song; Jingbo Zhu; Qun Liu
We study a novel architecture for syntactic SMT. In contrast to the dominant approach in the literature, the system does not rely on translation rules, but treats translation as an unconstrained target-sentence generation task, using soft features to capture lexical and syntactic correspondences between the source and target languages. Target syntax features and bilingual translation features are trained consistently in a discriminative model. Experiments on the IWSLT 2010 dataset show that the system achieves BLEU scores comparable to state-of-the-art syntactic SMT systems.
meeting of the association for computational linguistics | 2017
Linfeng Song; Xiaochang Peng; Yue Zhang; Zhiguo Wang; Daniel Gildea
This paper addresses the task of AMR-to-text generation by leveraging synchronous node replacement grammar. During training, graph-to-string rules are learned using a heuristic extraction algorithm. At test time, a graph transducer is applied to collapse input AMRs and generate output sentences. Evaluated on SemEval-2016 Task 8, our method gives a BLEU score of 25.62, which is the best reported so far.
joint conference on lexical and computational semantics | 2016
Linfeng Song; Zhiguo Wang; Haitao Mi; Daniel Gildea
Conventional word sense induction (WSI) methods usually represent each instance with discrete linguistic features or co-occurrence features, and train a model for each polysemous word individually. In this work, we propose to learn sense embeddings for the WSI task. In the training stage, our method induces several sense centroids (embeddings) for each polysemous word. In the testing stage, our method represents each instance as a contextual vector, and induces its sense by finding the nearest sense centroid in the embedding space. The advantages of our method are (1) distributed sense vectors are taken as the knowledge representations, which are trained discriminatively and usually perform better than traditional count-based distributional models, and (2) a general model for the whole vocabulary is jointly trained to induce sense centroids under a multitask learning framework. Evaluated on the SemEval-2010 WSI dataset, our method outperforms all participants and most of the recent state-of-the-art methods. We further verify the two advantages by comparing with carefully designed baselines.
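The induce-then-assign procedure above can be sketched in a few lines. This is a simplified online-clustering illustration: the paper trains centroids jointly under a multitask objective, whereas here a new sense centroid is spawned whenever no existing centroid is similar enough, and the nearest centroid is otherwise updated as a running mean:

```python
import numpy as np

def cosine(a, b):
    # cosine similarity between two dense vectors
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def contextual_vector(context_words, emb):
    # represent an instance by averaging its context word embeddings
    return np.mean([emb[w] for w in context_words if w in emb], axis=0)

def induce_senses(instance_vectors, threshold=0.5):
    """Toy online sense induction for one polysemous word.

    Assign each instance to the nearest centroid when similarity
    exceeds `threshold`; otherwise start a new sense centroid.
    """
    centroids, counts, labels = [], [], []
    for v in instance_vectors:
        if centroids:
            sims = [cosine(v, c) for c in centroids]
            i = int(np.argmax(sims))
            if sims[i] >= threshold:
                counts[i] += 1
                centroids[i] += (v - centroids[i]) / counts[i]  # running mean
                labels.append(i)
                continue
        centroids.append(v.astype(float))
        counts.append(1)
        labels.append(len(centroids) - 1)
    return centroids, labels
```

At test time, an unseen instance is labeled by `np.argmax` over its cosine similarities to the induced centroids, mirroring the nearest-centroid assignment described in the abstract.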
north american chapter of the association for computational linguistics | 2018
Linfeng Song; Zhiguo Wang; Wael Hamza; Yue Zhang; Daniel Gildea
The task of natural question generation is to generate a corresponding question given an input passage (fact) and answer. It is useful for enlarging the training sets of QA systems. Previous work has adopted sequence-to-sequence models that take as input a passage with an additional bit indicating the answer position. However, they do not explicitly model the interaction between the answer and the other context within the passage. We propose a model that matches the answer with the passage before generating the question. Experiments show that our model outperforms the existing state of the art using rich features.
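The answer-position input used by the sequence-to-sequence baselines can be sketched as span tags over the passage. This is an illustrative B/I/O tagger, not the paper's implementation; the helper name and tag set are assumptions:

```python
def answer_position_features(passage_tokens, answer_tokens):
    """Tag each passage token to mark the answer span: "B" for the
    first answer token, "I" for the rest, "O" elsewhere.  These tags
    play the role of the answer-position bit fed to seq2seq models."""
    tags = ["O"] * len(passage_tokens)
    n = len(answer_tokens)
    for i in range(len(passage_tokens) - n + 1):
        if passage_tokens[i:i + n] == answer_tokens:
            tags[i] = "B"
            for j in range(i + 1, i + n):
                tags[j] = "I"
            break  # tag only the first occurrence
    return tags
```

For example, tagging the passage "the cat sat on the mat" with the answer "the mat" marks only the final two tokens, leaving the earlier "the" untouched.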
workshop on chinese lexical semantics | 2013
Xing Wang; Jun Xie; Linfeng Song; Yajuan Lv; Jianmin Yao
When hierarchical phrase-based statistical machine translation systems are used for language translation, content words are sometimes lost: source-side content words are left untranslated in the target text during decoding. Even when the translations' BLEU scores are high, the translations are difficult to understand because of the missing content words. In this paper, we propose a simple and efficient phrase-filtering method, which checks whether a phrase translates its content words in order to decide whether the phrase may be used in decoding. The experimental results show that the proposed method alleviates the loss of content words and improves BLEU scores.
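The filtering idea above can be sketched as a content-word coverage check. This is a minimal illustration under assumed inputs: the stopword list is a toy function-word set, and `lexicon` is a hypothetical map from a source word to its candidate translations (the paper's criterion is defined over the extracted phrase table):

```python
STOPWORDS = {"la", "le", "les", "de", "du"}  # toy function-word list

def content_words(tokens):
    # keep only words that carry content (drop function words)
    return [w for w in tokens if w not in STOPWORDS]

def keep_phrase(src_tokens, tgt_tokens, lexicon):
    """Keep a phrase pair only if every source content word has at
    least one plausible translation on the target side."""
    tgt = set(tgt_tokens)
    return all(any(t in tgt for t in lexicon.get(w, ()))
               for w in content_words(src_tokens))

# toy lexicon: "maison" may translate to "house" or "home"
lexicon = {"maison": ("house", "home")}
keep_phrase(["la", "maison"], ["the", "house"], lexicon)  # kept
keep_phrase(["la", "maison"], ["it"], lexicon)            # filtered out
```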
international conference on asian language processing | 2013
Linfeng Song; Jun Xie; Xing Wang; Yajuan Lü; Qun Liu
Spoken language translation usually suffers from missing translations of content words, failing to generate appropriate output. In this paper we propose a novel mutual-information-based method that improves spoken language translation by recovering the missing translations of content words. We exploit several features that indicate how well the inner content words of each rule are translated, letting MT systems select better translation rules. Experimental results show that our method improves translation performance significantly, by 1.95 to 4.47 BLEU points on different test sets.
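A mutual-information signal of the kind described above can be sketched as pointwise mutual information between a source content word and a candidate target word, estimated from co-occurrence counts. The counts here are hypothetical, and the function is an illustration of the general feature family, not the paper's exact formulation:

```python
import math

def pmi(cooc, src_count, tgt_count, total):
    """Pointwise mutual information between a source word and a target
    word: log p(s, t) / (p(s) p(t)), with probabilities estimated by
    relative frequency from (toy) co-occurrence counts."""
    p_joint = cooc / total
    p_src = src_count / total
    p_tgt = tgt_count / total
    return math.log(p_joint / (p_src * p_tgt))

# words that co-occur far more often than chance get a high score,
# suggesting the rule translates the content word rather than dropping it
score = pmi(cooc=10, src_count=20, tgt_count=20, total=400)
```

A rule whose inner content words all achieve high PMI with some word on its target side is more likely to translate them; low scores across the board suggest the content word is being dropped.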
national conference on artificial intelligence | 2014
Linfeng Song; Yue Zhang; Kai Song; Qun Liu
empirical methods in natural language processing | 2013
Fandong Meng; Jun Xie; Linfeng Song; Yajuan Lü; Qun Liu