Linfeng Song
Chinese Academy of Sciences
Publication
Featured research published by Linfeng Song.
conference on computational natural language learning | 2015
Xiaochang Peng; Linfeng Song; Daniel Gildea
This paper presents a synchronous-graph-grammar-based approach for string-to-AMR parsing. We apply Markov Chain Monte Carlo (MCMC) algorithms to learn Synchronous Hyperedge Replacement Grammar (SHRG) rules from a forest that represents likely derivations consistent with a fixed string-to-graph alignment. Drawing an analogy between string-to-AMR parsing and phrase-based machine translation, we derive an efficient algorithm for learning graph grammars from string–graph pairs. We propose an effective approximation strategy to resolve the complexity issue of graph compositions. We also present useful strategies for overcoming existing problems in an SHRG-based parser, along with preliminary results of the graph-grammar-based approach.
empirical methods in natural language processing | 2016
Linfeng Song; Yue Zhang; Xiaochang Peng; Zhiguo Wang; Daniel Gildea
The task of AMR-to-text generation is to generate grammatical text that preserves the semantic meaning of a given AMR graph. We attack the task by first partitioning the AMR graph into smaller fragments, then generating a translation for each fragment, and finally deciding the order by solving an asymmetric generalized traveling salesman problem (AGTSP). A Maximum Entropy classifier is trained to estimate the traveling costs, and a TSP solver is used to find an optimized solution. The final model reports a BLEU score of 22.44 on the SemEval-2016 Task 8 dataset.
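The fragment-ordering step above can be illustrated with a toy sketch. Here the asymmetric cost matrix is hand-set for illustration (in the paper it comes from a trained Maximum Entropy classifier), and a brute-force search stands in for a real TSP solver, which is only feasible for a handful of fragments:

```python
from itertools import permutations

def order_fragments(fragments, cost):
    """Find the lowest-cost ordering of translated fragments.

    cost[i][j] is the (asymmetric) cost of placing fragment j
    directly after fragment i.  Brute force over permutations is a
    toy stand-in for the TSP solver used in the paper.
    """
    n = len(fragments)
    best_order, best_cost = None, float("inf")
    for perm in permutations(range(n)):
        c = sum(cost[perm[k]][perm[k + 1]] for k in range(n - 1))
        if c < best_cost:
            best_cost, best_order = c, perm
    return [fragments[i] for i in best_order]

# toy example with hand-set costs (not from a trained classifier)
fragments = ["the boy", "wants", "to go"]
cost = [[0, 1, 5],
        [4, 0, 1],
        [2, 6, 0]]
print(" ".join(order_fragments(fragments, cost)))  # prints "the boy wants to go"
```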
empirical methods in natural language processing | 2014
Yue Zhang; Kai Song; Linfeng Song; Jingbo Zhu; Qun Liu
We study a novel architecture for syntactic SMT. In contrast to the dominant approach in the literature, the system does not rely on translation rules, but treats translation as an unconstrained target-sentence generation task, using soft features to capture lexical and syntactic correspondences between the source and target languages. Target syntax features and bilingual translation features are trained consistently in a discriminative model. Experiments on the IWSLT 2010 dataset show that the system achieves BLEU scores comparable to state-of-the-art syntactic SMT systems.
meeting of the association for computational linguistics | 2017
Linfeng Song; Xiaochang Peng; Yue Zhang; Zhiguo Wang; Daniel Gildea
This paper addresses the task of AMR-to-text generation by leveraging synchronous node replacement grammar. During training, graph-to-string rules are learned using a heuristic extraction algorithm. At test time, a graph transducer is applied to collapse input AMRs and generate output sentences. Evaluated on SemEval-2016 Task 8, our method gives a BLEU score of 25.62, which is the best reported so far.
joint conference on lexical and computational semantics | 2016
Linfeng Song; Zhiguo Wang; Haitao Mi; Daniel Gildea
Conventional word sense induction (WSI) methods usually represent each instance with discrete linguistic features or co-occurrence features, and train a model for each polysemous word individually. In this work, we propose to learn sense embeddings for the WSI task. In the training stage, our method induces several sense centroids (embeddings) for each polysemous word. In the testing stage, our method represents each instance as a contextual vector, and induces its sense by finding the nearest sense centroid in the embedding space. The advantages of our method are (1) distributed sense vectors are taken as the knowledge representations, which are trained discriminatively and usually perform better than traditional count-based distributional models, and (2) a general model for the whole vocabulary is jointly trained to induce sense centroids under a multitask learning framework. Evaluated on the SemEval-2010 WSI dataset, our method outperforms all participants and most of the recent state-of-the-art methods. We further verify the two advantages by comparing with carefully designed baselines.
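The induce-then-assign procedure above can be sketched in a few lines. This is a simplified online-clustering illustration: the paper trains centroids jointly under a multitask objective, whereas here a new sense centroid is spawned whenever no existing centroid is similar enough, and the nearest centroid is otherwise updated as a running mean:

```python
import numpy as np

def cosine(a, b):
    # cosine similarity between two dense vectors
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def contextual_vector(context_words, emb):
    # represent an instance by averaging its context word embeddings
    return np.mean([emb[w] for w in context_words if w in emb], axis=0)

def induce_senses(instance_vectors, threshold=0.5):
    """Toy online sense induction for one polysemous word.

    Assign each instance to the nearest centroid when similarity
    exceeds `threshold`; otherwise start a new sense centroid.
    """
    centroids, counts, labels = [], [], []
    for v in instance_vectors:
        if centroids:
            sims = [cosine(v, c) for c in centroids]
            i = int(np.argmax(sims))
            if sims[i] >= threshold:
                counts[i] += 1
                centroids[i] += (v - centroids[i]) / counts[i]  # running mean
                labels.append(i)
                continue
        centroids.append(v.astype(float))
        counts.append(1)
        labels.append(len(centroids) - 1)
    return centroids, labels
```

At test time, an unseen instance is labeled by `np.argmax` over its cosine similarities to the induced centroids, mirroring the nearest-centroid assignment described in the abstract.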
north american chapter of the association for computational linguistics | 2018
Linfeng Song; Zhiguo Wang; Wael Hamza; Yue Zhang; Daniel Gildea
The task of natural question generation is to generate a corresponding question given an input passage (fact) and answer. It is useful for enlarging the training sets of QA systems. Previous work has adopted sequence-to-sequence models that take as input a passage with an additional bit indicating the answer position. However, they do not explicitly model the interaction between the answer and the other context within the passage. We propose a model that matches the answer with the passage before generating the question. Experiments show that our model outperforms the existing state of the art using rich features.
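The answer-position input used by the sequence-to-sequence baselines can be sketched as span tags over the passage. This is an illustrative B/I/O tagger, not the paper's implementation; the helper name and tag set are assumptions:

```python
def answer_position_features(passage_tokens, answer_tokens):
    """Tag each passage token to mark the answer span: "B" for the
    first answer token, "I" for the rest, "O" elsewhere.  These tags
    play the role of the answer-position bit fed to seq2seq models."""
    tags = ["O"] * len(passage_tokens)
    n = len(answer_tokens)
    for i in range(len(passage_tokens) - n + 1):
        if passage_tokens[i:i + n] == answer_tokens:
            tags[i] = "B"
            for j in range(i + 1, i + n):
                tags[j] = "I"
            break  # tag only the first occurrence
    return tags
```

For example, tagging the passage "the cat sat on the mat" with the answer "the mat" marks only the final two tokens, leaving the earlier "the" untouched.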
workshop on chinese lexical semantics | 2013
Xing Wang; Jun Xie; Linfeng Song; Yajuan Lv; Jianmin Yao
When hierarchical phrase-based statistical machine translation systems are used for language translation, content words are sometimes lost: source-side content words are left untranslated in the target text during decoding. Even when the translations' BLEU scores are high, the translations are difficult to understand because of the missing content words. In this paper, we propose a simple and efficient phrase-filtering method, which checks whether a phrase translates its content words in order to decide whether the phrase may be used in decoding. The experimental results show that the proposed method alleviates the loss of content words and improves BLEU scores.
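The filtering idea above can be sketched as a content-word coverage check. This is a minimal illustration under assumed inputs: the stopword list is a toy function-word set, and `lexicon` is a hypothetical map from a source word to its candidate translations (the paper's criterion is defined over the extracted phrase table):

```python
STOPWORDS = {"la", "le", "les", "de", "du"}  # toy function-word list

def content_words(tokens):
    # keep only words that carry content (drop function words)
    return [w for w in tokens if w not in STOPWORDS]

def keep_phrase(src_tokens, tgt_tokens, lexicon):
    """Keep a phrase pair only if every source content word has at
    least one plausible translation on the target side."""
    tgt = set(tgt_tokens)
    return all(any(t in tgt for t in lexicon.get(w, ()))
               for w in content_words(src_tokens))

# toy lexicon: "maison" may translate to "house" or "home"
lexicon = {"maison": ("house", "home")}
keep_phrase(["la", "maison"], ["the", "house"], lexicon)  # kept
keep_phrase(["la", "maison"], ["it"], lexicon)            # filtered out
```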
international conference on asian language processing | 2013
Linfeng Song; Jun Xie; Xing Wang; Yajuan Lü; Qun Liu
Spoken language translation usually suffers from missing translations of content words, failing to generate appropriate output. In this paper we propose a novel mutual-information-based method that improves spoken language translation by recovering the missing translations of content words. We exploit several features that indicate how well the inner content words of each rule are translated, letting MT systems select better translation rules. Experimental results show that our method improves translation performance significantly, by 1.95 to 4.47 BLEU points on different test sets.
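A mutual-information signal of the kind described above can be sketched as pointwise mutual information between a source content word and a candidate target word, estimated from co-occurrence counts. The counts here are hypothetical, and the function is an illustration of the general feature family, not the paper's exact formulation:

```python
import math

def pmi(cooc, src_count, tgt_count, total):
    """Pointwise mutual information between a source word and a target
    word: log p(s, t) / (p(s) p(t)), with probabilities estimated by
    relative frequency from (toy) co-occurrence counts."""
    p_joint = cooc / total
    p_src = src_count / total
    p_tgt = tgt_count / total
    return math.log(p_joint / (p_src * p_tgt))

# words that co-occur far more often than chance get a high score,
# suggesting the rule translates the content word rather than dropping it
score = pmi(cooc=10, src_count=20, tgt_count=20, total=400)
```

A rule whose inner content words all achieve high PMI with some word on its target side is more likely to translate them; low scores across the board suggest the content word is being dropped.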
national conference on artificial intelligence | 2014
Linfeng Song; Yue Zhang; Kai Song; Qun Liu
empirical methods in natural language processing | 2013
Fandong Meng; Jun Xie; Linfeng Song; Yajuan Lü; Qun Liu