Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Jinsong Su is active.

Publication


Featured research published by Jinsong Su.


Empirical Methods in Natural Language Processing | 2015

Shallow Convolutional Neural Network for Implicit Discourse Relation Recognition

Biao Zhang; Jinsong Su; Deyi Xiong; Yaojie Lu; Hong Duan; Junfeng Yao

Implicit discourse relation recognition remains a serious challenge due to the absence of discourse connectives. In this paper, we propose a Shallow Convolutional Neural Network (SCNN) for implicit discourse relation recognition, which contains only one hidden layer but is effective in relation recognition. The shallow structure alleviates the overfitting problem, while the convolution and nonlinear operations help preserve the recognition and generalization ability of our model. Experiments on the benchmark data set show that our model achieves performance comparable to, and in some cases better than, current state-of-the-art systems.
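The architecture is simple enough to sketch. Below is a minimal PyTorch sketch of a one-hidden-layer convolutional classifier over the two discourse arguments; the embedding size, filter width, filter count, and four-way relation set are illustrative assumptions, not the paper's exact configuration.

import torch
import torch.nn as nn

class ShallowCNN(nn.Module):
    # Hypothetical sketch: a single convolution layer with a nonlinearity and
    # max-pooling, mirroring the "one hidden layer" idea of SCNN.
    def __init__(self, vocab_size, emb_dim=128, n_filters=100, width=3, n_relations=4):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.conv = nn.Conv1d(emb_dim, n_filters, kernel_size=width, padding=1)
        self.out = nn.Linear(2 * n_filters, n_relations)

    def encode(self, tokens):                  # tokens: (batch, seq_len)
        x = self.emb(tokens).transpose(1, 2)   # (batch, emb_dim, seq_len)
        h = torch.tanh(self.conv(x))           # convolution + nonlinearity
        return h.max(dim=2).values             # max-pooling over time

    def forward(self, arg1, arg2):
        # Classify the relation from the concatenated argument representations.
        return self.out(torch.cat([self.encode(arg1), self.encode(arg2)], dim=-1))

The shallowness is the point: with so few parameters, the model is harder to overfit on small annotated discourse corpora.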


Empirical Methods in Natural Language Processing | 2016

Variational Neural Machine Translation

Biao Zhang; Deyi Xiong; Jinsong Su; Hong Duan; Min Zhang

Models of neural machine translation are often from a discriminative family of encoder-decoders that learn a conditional distribution of a target sentence given a source sentence. In this paper, we propose a variational model to learn this conditional distribution for neural machine translation: a variational encoder-decoder model that can be trained end-to-end. Different from the vanilla encoder-decoder model that generates target translations from hidden representations of source sentences alone, the variational model introduces a continuous latent variable to explicitly model underlying semantics of source sentences and to guide the generation of target translations. In order to perform efficient posterior inference and large-scale training, we build a neural posterior approximator conditioned on both the source and the target sides, and equip it with a reparameterization technique to estimate the variational lower bound. Experiments on both Chinese-English and English-German translation tasks show that the proposed variational neural machine translation achieves significant improvements over the vanilla neural machine translation baselines.
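As a reading aid (standard conditional-VAE notation, not necessarily the paper's exact formulation), the variational lower bound such a model maximizes can be written as:

\log p_\theta(y \mid x) \;\ge\; \mathbb{E}_{q_\phi(z \mid x, y)}\!\left[\log p_\theta(y \mid x, z)\right] - \mathrm{KL}\!\left(q_\phi(z \mid x, y) \,\|\, p_\theta(z \mid x)\right)

where x is the source sentence, y the target sentence, z the continuous latent variable, and q_\phi the neural posterior approximator conditioned on both sides. Assuming diagonal Gaussians, the reparameterization z = \mu + \sigma \odot \epsilon with \epsilon \sim \mathcal{N}(0, I) makes the bound differentiable end-to-end.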


Empirical Methods in Natural Language Processing | 2015

Bilingual Correspondence Recursive Autoencoder for Statistical Machine Translation

Jinsong Su; Deyi Xiong; Biao Zhang; Yang Liu; Junfeng Yao; Min Zhang

Learning semantic representations and tree structures of bilingual phrases is beneficial for statistical machine translation. In this paper, we propose a new neural network model called Bilingual Correspondence Recursive Autoencoder (BCorrRAE) to model bilingual phrases in translation. We incorporate word alignments into BCorrRAE to allow it to freely access bilingual constraints at different levels. BCorrRAE minimizes a joint objective combining a recursive autoencoder reconstruction error, a structural alignment consistency error, and a cross-lingual reconstruction error, so as to not only generate alignment-consistent phrase structures but also capture different levels of semantic relations within bilingual phrases. In order to examine the effectiveness of BCorrRAE, we incorporate both semantic and structural similarity features, built on the bilingual phrase representations and tree structures learned by BCorrRAE, into a state-of-the-art SMT system. Experiments on NIST Chinese-English test sets show that our model achieves a substantial improvement of up to 1.55 BLEU points over the baseline.
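A minimal numpy sketch of the recursive-autoencoder building block underlying the reconstruction error, plus a weighted combination of the three error terms; the decoder form, function names, and mixing weights are assumptions for illustration.

import numpy as np

def compose(c1, c2, W, b):
    # Recursive-autoencoder composition: parent = tanh(W [c1; c2] + b).
    return np.tanh(W @ np.concatenate([c1, c2]) + b)

def reconstruction_error(parent, c1, c2, W_dec, b_dec):
    # Decode the parent back into its two children and score with squared error.
    rec = np.tanh(W_dec @ parent + b_dec)
    return float(np.sum((rec - np.concatenate([c1, c2])) ** 2))

def joint_objective(rec_err, align_err, xling_err, alpha=0.15, beta=0.15):
    # Weighted sum of reconstruction, alignment-consistency, and cross-lingual
    # reconstruction errors; the weights here are illustrative, not the paper's.
    return (1.0 - alpha - beta) * rec_err + alpha * align_err + beta * xling_err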


International Joint Conference on Natural Language Processing | 2015

A Context-Aware Topic Model for Statistical Machine Translation

Jinsong Su; Deyi Xiong; Yang Liu; Xianpei Han; Hongyu Lin; Junfeng Yao; Min Zhang

Lexical selection is crucial for statistical machine translation. Previous studies separately exploit sentence-level contexts and document-level topics for lexical selection, neglecting their correlations. In this paper, we propose a context-aware topic model for lexical selection, which not only models local contexts and global topics but also captures their correlations. The model uses target-side translations as hidden variables to connect document topics and source-side local contextual words. In order to learn hidden variables and distributions from data, we introduce a Gibbs sampling algorithm for statistical estimation and inference. A new translation probability based on the distributions learned by the model is integrated into a translation system for lexical selection. Experimental results on NIST Chinese-English test sets demonstrate that 1) our model significantly outperforms previous lexical selection methods and 2) modeling correlations between local words and global topics can further improve translation quality.
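Collapsed Gibbs sampling for such models follows a standard pattern: for each hidden assignment, decrement its counts, sample from the conditional, and increment the counts again. The sketch below is a generic LDA-style skeleton under assumed count tables; the paper's actual conditional additionally involves target-side translations as hidden variables.

import numpy as np

def gibbs_pass(assignments, docs, n_topics, n_dt, n_tw, n_t, alpha=0.1, beta=0.01):
    # One sweep over token-topic assignments.
    # n_dt: doc-topic counts, n_tw: topic-word counts, n_t: per-topic totals.
    V = n_tw.shape[1]
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k_old = assignments[d][i]
            n_dt[d, k_old] -= 1; n_tw[k_old, w] -= 1; n_t[k_old] -= 1
            # p(k) proportional to (n_dt + alpha) * (n_tw + beta) / (n_t + V*beta)
            p = (n_dt[d] + alpha) * (n_tw[:, w] + beta) / (n_t + V * beta)
            k_new = np.random.choice(n_topics, p=p / p.sum())
            assignments[d][i] = k_new
            n_dt[d, k_new] += 1; n_tw[k_new, w] += 1; n_t[k_new] += 1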


International Conference on Asian Language Processing | 2011

An Orientation Model for Hierarchical Phrase-Based Translation

Xinyan Xiao; Jinsong Su; Yang Liu; Qun Liu; Shouxun Lin

Hierarchical phrase-based (HPB) translation exploits the power of grammar to perform long-distance reorderings, but without specifying nonterminal orientations against adjacent blocks or considering the lexical information covered by nonterminals. In this paper, we borrow from phrase-based systems the idea of an orientation model to enhance the reordering ability of HPB translation. We distinguish three orientations (monotone, swap, discontinuous) of a nonterminal based on the word alignments of the grammar, and select the appropriate orientation of a nonterminal using the lexical information it covers. By incorporating the orientation model, our approach significantly outperforms a standard HPB system by up to 1.02 BLEU points on a large-scale NIST Chinese-English translation task, and by 0.51 BLEU points on a WMT German-English translation task.
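The orientation classification itself is straightforward once spans are known. A hypothetical sketch, treating spans as (start, end) index pairs; the adjacency test the paper derives from the grammar's word alignments may differ in detail.

def orientation(src_span, tgt_span, prev_src_span, prev_tgt_span):
    # Monotone: the nonterminal follows (or precedes) its adjacent block on
    # both sides; swap: the two sides disagree; otherwise discontinuous.
    src_after = prev_src_span[1] == src_span[0]
    tgt_after = prev_tgt_span[1] == tgt_span[0]
    src_before = src_span[1] == prev_src_span[0]
    tgt_before = tgt_span[1] == prev_tgt_span[0]
    if (src_after and tgt_after) or (src_before and tgt_before):
        return "monotone"
    if (src_after and tgt_before) or (src_before and tgt_after):
        return "swap"
    return "discontinuous"

For example, orientation((2, 4), (5, 7), (0, 2), (3, 5)) returns "monotone".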


Empirical Methods in Natural Language Processing | 2016

Bilingually-constrained Synthetic Data for Implicit Discourse Relation Recognition

Changxing Wu; Xiaodong Shi; Yidong Chen; Yanzhou Huang; Jinsong Su

To alleviate the shortage of labeled data, we propose to use bilingually-constrained synthetic implicit data for implicit discourse relation recognition. These data are extracted from a bilingual sentence-aligned corpus according to the implicit/explicit mismatch between different languages. Incorporating these data via a multi-task neural network model achieves significant improvements over baselines on both the English PDTB and Chinese CDTB data sets.
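The multi-task setup can be sketched as a shared encoder with separate classification heads, one trained on the gold implicit data and one on the synthetic data; the dimensions and head structure below are assumptions, not the paper's exact model.

import torch
import torch.nn as nn

class MultiTaskRecognizer(nn.Module):
    # Hypothetical sketch: the encoder is shared, so gradients from the
    # synthetic-data task also shape the representation used for gold data.
    def __init__(self, encoder, hidden=256, n_relations=4):
        super().__init__()
        self.encoder = encoder
        self.head_gold = nn.Linear(hidden, n_relations)
        self.head_synthetic = nn.Linear(hidden, n_relations)

    def forward(self, arg_pair, task):
        h = self.encoder(arg_pair)
        return self.head_gold(h) if task == "gold" else self.head_synthetic(h)

Training would alternate mini-batches from the two data sources, letting the abundant synthetic pairs regularize the shared encoder.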


Empirical Methods in Natural Language Processing | 2016

Variational Neural Discourse Relation Recognizer

Biao Zhang; Deyi Xiong; Jinsong Su; Qun Liu; Rongrong Ji; Hong Duan; Min Zhang

Implicit discourse relation recognition is a crucial component for automatic discourse-level analysis and natural language understanding. Previous studies exploit discriminative models that are built on either powerful manual features or deep discourse representations. In this paper, instead, we explore generative models and propose a variational neural discourse relation recognizer, which we refer to as VarNDRR. VarNDRR establishes a directed probabilistic model with a latent continuous variable that generates both a discourse and the relation between the two arguments of the discourse. In order to perform efficient inference and learning, we introduce neural discourse relation models to approximate the prior and posterior distributions of the latent variable, and employ these approximated distributions to optimize a reparameterized variational lower bound. This allows VarNDRR to be trained with standard stochastic gradient methods. Experiments on the benchmark data set show that VarNDRR achieves comparable results against state-of-the-art baselines without using any manual features.
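The reparameterized lower bound mentioned above is optimized with the now-standard trick; a minimal sketch assuming diagonal-Gaussian prior and posterior (the function names are illustrative):

import torch

def reparameterize(mu, logvar):
    # z = mu + sigma * eps with eps ~ N(0, I), so gradients flow through mu and sigma.
    eps = torch.randn_like(mu)
    return mu + torch.exp(0.5 * logvar) * eps

def gaussian_kl(mu_q, logvar_q, mu_p, logvar_p):
    # KL(q || p) between diagonal Gaussians; note that in VarNDRR the prior is
    # itself approximated by a neural model rather than fixed at N(0, I).
    var_q, var_p = logvar_q.exp(), logvar_p.exp()
    return 0.5 * torch.sum(
        logvar_p - logvar_q + (var_q + (mu_q - mu_p) ** 2) / var_p - 1.0, dim=-1)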


Neurocomputing | 2018

Lattice-to-sequence attentional Neural Machine Translation models

Zhixing Tan; Jinsong Su; Boli Wang; Yidong Chen; Xiaodong Shi

The dominant Neural Machine Translation (NMT) models usually resort to word-level modeling to embed input sentences into semantic space. However, this may not be optimal for the encoder modeling of NMT, especially for languages whose tokenizations are usually ambiguous: on one hand, tokenization errors may negatively affect the encoder modeling of NMT; on the other hand, the optimal tokenization granularity is unclear for NMT. In this paper, we propose lattice-to-sequence attentional NMT models, which generalize the standard Recurrent Neural Network (RNN) encoders to lattice topology. Specifically, they take as input a word lattice which compactly encodes many tokenization alternatives, and learn to generate the hidden state for the current step from multiple inputs and hidden states in previous steps. Compared with the standard RNN encoder, the proposed encoders not only alleviate the negative impact of tokenization errors but are also more expressive and flexible for encoding the meaning of input sentences. Experimental results on both Chinese–English and Japanese–English translations demonstrate the effectiveness of our models.
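The encoder generalization can be sketched as an RNN step whose previous state is a pooled summary of all predecessor nodes in the lattice. The sketch below uses plain mean-pooling for brevity; the paper describes attentional variants, and all dimensions here are assumptions.

import torch
import torch.nn as nn

class LatticeGRUEncoder(nn.Module):
    # Hypothetical sketch: nodes are processed in topological order, and each
    # node's hidden state conditions on the states of all incoming edges.
    def __init__(self, vocab_size, emb_dim=128, hidden=256):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.cell = nn.GRUCell(emb_dim, hidden)
        self.hidden = hidden

    def forward(self, nodes, predecessors):
        # nodes: token ids in topological order;
        # predecessors[i]: indices of nodes with an edge into node i.
        states = []
        for i, tok in enumerate(nodes):
            if predecessors[i]:
                h_prev = torch.stack([states[j] for j in predecessors[i]]).mean(0)
            else:
                h_prev = torch.zeros(self.hidden)
            x = self.emb(torch.tensor(tok))
            states.append(self.cell(x.unsqueeze(0), h_prev.unsqueeze(0)).squeeze(0))
        return states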


Information Sciences | 2018

A neural generative autoencoder for bilingual word embeddings

Jinsong Su; Shan Wu; Biao Zhang; Changxing Wu; Yue Qin; Deyi Xiong

Bilingual word embeddings (BWEs) have been shown to be useful in various cross-lingual natural language processing tasks. To accurately learn BWEs, previous studies often resort to discriminative approaches which explore semantic proximities between translation equivalents of different languages. Instead, in this paper, we propose a neural generative bilingual autoencoder (NGBAE) which introduces a latent variable to explicitly induce the underlying semantics of bilingual text. In this way, NGBAE is able to obtain better BWEs from more robust bilingual semantics by modeling the semantic distributions of bilingual text. In order to facilitate scalable inference and learning, we utilize deep neural networks to perform the recognition and generation procedures, and then employ the stochastic gradient variational Bayes algorithm to optimize them jointly. We validate the proposed model via both extrinsic (cross-lingual document classification and translation probability modeling) and intrinsic (word embedding analysis) evaluations. Experimental results demonstrate the effectiveness of NGBAE on learning BWEs.
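The generative idea can be sketched as one recognition network producing a shared latent variable from a sentence pair and two generation networks regenerating each side. Bag-of-words inputs and all dimensions below are simplifying assumptions.

import torch
import torch.nn as nn

class BilingualGenerativeAE(nn.Module):
    # Hypothetical sketch: a shared z is inferred from both languages and
    # used to reconstruct both, trained with SGVB-style sampling.
    def __init__(self, v_src, v_tgt, latent=100, hidden=256):
        super().__init__()
        self.recognize = nn.Sequential(nn.Linear(v_src + v_tgt, hidden), nn.Tanh())
        self.to_mu = nn.Linear(hidden, latent)
        self.to_logvar = nn.Linear(hidden, latent)
        self.gen_src = nn.Linear(latent, v_src)
        self.gen_tgt = nn.Linear(latent, v_tgt)

    def forward(self, bow_src, bow_tgt):
        h = self.recognize(torch.cat([bow_src, bow_tgt], dim=-1))
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterized sample
        return self.gen_src(z), self.gen_tgt(z), mu, logvar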


The Astrophysical Journal | 2017

On Estimating Force-Freeness Based on Observed Magnetograms

Xiaobing Zhang; M. Zhang; Jinsong Su

It is common practice in the solar physics community to test whether measured photospheric or chromospheric vector magnetograms are force-free, using the Maxwell stress as a measure. Some previous studies have suggested that magnetic fields of active regions in the solar chromosphere are close to being force-free, whereas there is no consensus among previous studies on whether magnetic fields of active regions in the solar photosphere are force-free or not. Here we use three kinds of representative magnetic fields (analytical force-free solutions, modeled solar-like force-free fields, and observed non-force-free fields) to discuss how measurement issues such as a limited field of view (FOV), instrument sensitivity, and measurement error can affect the estimation of force-freeness based on observed magnetograms. Unlike previous studies that focus on the effects of a limited FOV or instrument sensitivity, our calculation shows that measurement error alone can significantly influence estimates of force-freeness, because measurement errors in horizontal magnetic fields are usually ten times larger than those in vertical fields. This property of measurement errors, interacting with the particular form of the formula used to estimate force-freeness, can result in incorrect judgments: a truly force-free field may be mistakenly estimated as non-force-free, and a truly non-force-free field may be estimated as force-free. Our analysis calls for caution when interpreting estimates of force-freeness based on measured magnetograms, and also suggests that the true photospheric magnetic field may be further from force-free than it currently appears.
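The conventional measure referred to here is the net Lorentz force obtained by integrating the Maxwell stress over the magnetogram, normalized by a characteristic force magnitude; small ratios (commonly below about 0.1) are taken to indicate a nearly force-free field. A minimal numpy sketch in the spirit of the widely used Metcalf-style criteria; treat the exact convention and threshold as assumptions.

import numpy as np

def force_freeness_ratios(bx, by, bz):
    # Net-force components from the Maxwell stress over a magnetogram
    # (constant prefactors cancel in the ratios), normalized by a
    # characteristic magnitude fp. Inputs are 2-D arrays of field components.
    fx = -np.sum(bx * bz)
    fy = -np.sum(by * bz)
    fz = np.sum(bz ** 2 - bx ** 2 - by ** 2) / 2.0
    fp = np.sum(bx ** 2 + by ** 2 + bz ** 2) / 2.0
    return abs(fx) / fp, abs(fy) / fp, abs(fz) / fp

Because bx and by enter these sums quadratically, noise in the horizontal components (typically an order of magnitude larger than in bz) inflates both the numerators and fp, which is exactly the bias the paper analyzes.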

Collaboration


Dive into Jinsong Su's collaborations.

Top Co-Authors

Qun Liu

Dublin City University
