Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Shujian Huang is active.

Publication


Featured research published by Shujian Huang.


International Joint Conference on Natural Language Processing | 2015

A Neural Probabilistic Structured-Prediction Model for Transition-Based Dependency Parsing

Hao Zhou; Yue Zhang; Shujian Huang; Jiajun Chen

Neural probabilistic parsers are attractive for their capability of automatic feature combination and small data sizes. A transition-based greedy neural parser has achieved better accuracy than its linear counterpart. We propose a neural probabilistic structured-prediction model for transition-based dependency parsing, which integrates search and learning. Beam search is used for decoding, and contrastive learning is performed to maximize the sentence-level log-likelihood. In standard Penn Treebank experiments, the structured neural parser achieves a 1.8% accuracy improvement over a competitive greedy neural parser baseline, giving performance comparable to the best linear parser.
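
The contrastive sentence-level objective described above can be sketched numerically: the gold action sequence's score is normalized over the final beam rather than over all possible sequences. This is a minimal illustration, not the paper's implementation; the function name and score values are invented for the example.

```python
import math

def sentence_level_loss(gold_score, beam_scores):
    """Negative sentence-level log-likelihood of the gold sequence,
    normalized over the beam candidates (with the gold kept in the beam)."""
    scores = list(beam_scores)
    if gold_score not in scores:
        scores.append(gold_score)  # contrastive learning keeps the gold item
    log_z = math.log(sum(math.exp(s) for s in scores))
    return log_z - gold_score

# The loss shrinks as the gold sequence outscores the other beam items.
loose = sentence_level_loss(1.0, [1.0, 0.9])
tight = sentence_level_loss(5.0, [1.0, 0.9])
```

Training then pushes the model to rank the gold sequence above the rest of the beam, which is what lets beam-search decoding and learning interact.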


Meeting of the Association for Computational Linguistics | 2017

Improved Neural Machine Translation with a Syntax-Aware Encoder and Decoder

Huadong Chen; Shujian Huang; David Chiang; Jiajun Chen

Most neural machine translation (NMT) models are based on the sequential encoder-decoder framework, which makes no use of syntactic information. In this paper, we improve this model by explicitly incorporating source-side syntactic trees. More specifically, we propose (1) a bidirectional tree encoder which learns both sequential and tree structured representations; (2) a tree-coverage model that lets the attention depend on the source-side syntax. Experiments on Chinese-English translation demonstrate that our proposed models outperform the sequential attentional model as well as a stronger baseline with a bottom-up tree encoder and word coverage.


International Joint Conference on Artificial Intelligence | 2017

Deep Matrix Factorization Models for Recommender Systems

Hong-Jian Xue; Xinyu Dai; Jianbing Zhang; Shujian Huang; Jiajun Chen

Recommender systems usually make personalized recommendations using user-item interaction ratings, implicit feedback, and auxiliary information. Matrix factorization is the basic approach: it predicts a personalized ranking over a set of items for an individual user from the similarities among users and items. In this paper, we propose a novel matrix factorization model with a neural network architecture. First, we construct a user-item matrix from explicit ratings and non-preference implicit feedback. With this matrix as input, we present a deep structure learning architecture that learns a common low-dimensional space for the representations of users and items. Second, we design a new loss function based on binary cross entropy, which considers both explicit ratings and implicit feedback for better optimization. The experimental results show the effectiveness of both our proposed model and the loss function. On several benchmark datasets, our model outperforms other state-of-the-art methods. We also conduct extensive experiments to evaluate its performance under different experimental settings.
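
Two ingredients of the abstract above can be sketched concretely: the input matrix that mixes explicit ratings with non-preference implicit feedback, and a cross-entropy loss whose target is the rating rescaled to [0, 1]. This is an illustrative reading of the setup, assuming a 5-star rating scale; function names are not from the paper.

```python
import math

def build_interaction_matrix(ratings, n_users, n_items):
    """Explicit rating where observed; 0 marks non-preference implicit feedback."""
    Y = [[0.0] * n_items for _ in range(n_users)]
    for u, i, r in ratings:
        Y[u][i] = float(r)
    return Y

def normalized_bce(y, y_hat, max_rating=5.0, eps=1e-12):
    """Binary cross entropy with the explicit rating rescaled to [0, 1],
    so a 5-star rating pulls the prediction harder toward 1 than a 3-star."""
    t = y / max_rating
    return -(t * math.log(y_hat + eps) + (1 - t) * math.log(1 - y_hat + eps))
```

Because the target is a rescaled rating rather than a plain 0/1 label, the same loss handles explicit ratings and implicit feedback in one formula.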


CCL | 2015

Academic Paper Recommendation Based on Heterogeneous Graph

Linlin Pan; Xinyu Dai; Shujian Huang; Jiajun Chen

Digital libraries suffer from an overload problem: researchers must spend considerable time finding relevant papers. Fortunately, a recommender system can automatically find relevant papers for researchers based on the papers they have browsed. Previous paper recommendation methods are either citation-based or content-based. In this paper, we propose a novel recommendation method using a heterogeneous graph that includes both citation and content knowledge. Specifically, a heterogeneous graph is constructed to represent both the citation and content information within papers. We then apply a graph-based similarity learning algorithm to perform the paper recommendation task. Finally, we evaluate our proposed approach on the ACL Anthology Network data set and conduct an extensive comparison with other recommender approaches. The experimental results demonstrate that our approach outperforms traditional methods.
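
One common graph-based similarity learner for this kind of heterogeneous graph is random walk with restart, where paper nodes and content (term) nodes share one adjacency structure. The sketch below is a generic illustration under that assumption, not the paper's specific algorithm; the toy graph and node names are invented.

```python
def random_walk_with_restart(adj, seed, restart=0.15, iters=50):
    """Visit probabilities of a walk over the heterogeneous graph that
    restarts at the seed paper; higher probability = more similar node."""
    p = {n: 0.0 for n in adj}
    p[seed] = 1.0
    for _ in range(iters):
        q = {n: 0.0 for n in adj}
        for n, mass in p.items():
            nbrs = adj[n]
            if not nbrs:
                q[seed] += (1 - restart) * mass  # dangling mass returns to seed
            else:
                share = (1 - restart) * mass / len(nbrs)
                for m in nbrs:
                    q[m] += share
        q[seed] += restart  # restart step
        p = q
    return p

# Toy heterogeneous graph: paper-paper citation edges and paper-term content edges.
adj = {
    "paperA": ["term:neural", "paperB"],
    "paperB": ["term:neural", "paperA"],
    "paperC": ["term:graph"],
    "term:neural": ["paperA", "paperB"],
    "term:graph": ["paperC"],
}
sim = random_walk_with_restart(adj, "paperA")
# paperB, linked by both citation and shared content, scores above paperC.
```

The point of the heterogeneous construction is visible even in the toy graph: similarity can flow through either a citation edge or a shared term.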


International Conference on Asian Language Processing | 2014

Learning Word Embeddings from Dependency Relations

Yinggong Zhao; Shujian Huang; Xinyu Dai; Jianbing Zhang; Jiajun Chen

Continuous-space word representations have demonstrated their effectiveness in many natural language processing (NLP) tasks. The basic idea in embedding training is to update the embedding matrix based on a word's context. However, that context has been constrained to a fixed window of surrounding words, which we believe is not sufficient to represent the actual relations of a given center word. In this work we extend previous approaches by learning distributed representations from the dependency structure of a sentence, which can capture long-distance relations. Such contexts learn better word semantics, as demonstrated on the Semantic-Syntactic Word Relationship task. In addition, the dependency embeddings also achieve competitive results on the WordSim-353 task.
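
The contrast between the two context definitions can be made concrete: a fixed window misses syntactically related words that sit far away on the surface, while dependency contexts reach them directly. This is an illustrative sketch assuming a pre-parsed sentence (head indices and relation labels); the context string format is invented for the example.

```python
def window_contexts(tokens, idx, size=2):
    """Baseline context: a fixed window of surrounding words."""
    lo, hi = max(0, idx - size), min(len(tokens), idx + size + 1)
    return [tokens[i] for i in range(lo, hi) if i != idx]

def dependency_contexts(tokens, heads, labels, idx):
    """Dependency context: the word's head (via the inverse relation)
    and its dependents, regardless of surface distance."""
    ctx = []
    if heads[idx] >= 0:
        ctx.append(f"{labels[idx]}^-1_{tokens[heads[idx]]}")
    for j, h in enumerate(heads):
        if h == idx:
            ctx.append(f"{labels[j]}_{tokens[j]}")
    return ctx

# "the scientist who won discovered stars": the subject of "discovered"
# is "scientist", which lies outside a 2-word window around index 4.
tokens = ["the", "scientist", "who", "won", "discovered", "stars"]
heads = [1, 4, 3, 1, -1, 4]  # head index per token, -1 = root
labels = ["det", "nsubj", "nsubj", "relcl", "root", "dobj"]
```

Here the window context of "discovered" never sees its true subject, while the dependency context recovers it together with the relation type.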


Meeting of the Association for Computational Linguistics | 2017

Chunk-Based Bi-Scale Decoder for Neural Machine Translation

Hao Zhou; Zhaopeng Tu; Shujian Huang; Xiaohua Liu; Hang Li; Jiajun Chen

In typical neural machine translation (NMT), the decoder generates a sentence word by word, packing all linguistic granularities into the same RNN time-scale. In this paper, we propose a new type of decoder for NMT, which splits the decoder state into two parts and updates them on two different time-scales. Specifically, we first predict a chunk time-scale state for phrasal modeling, on top of which multiple word time-scale states are generated. In this way, the target sentence is translated hierarchically from chunks to words, leveraging information at different granularities. Experiments show that our proposed model significantly improves translation performance over the state-of-the-art NMT model.


North American Chapter of the Association for Computational Linguistics | 2016

PRIMT: A Pick-Revise Framework for Interactive Machine Translation

Shanbo Cheng; Shujian Huang; Huadong Chen; Xinyu Dai; Jiajun Chen

Interactive machine translation (IMT) uses human-computer interaction to improve the quality of MT. Traditional IMT methods impose a left-to-right order on the interactions, which makes it difficult to directly correct critical errors at the end of the sentence. In this paper, we propose an IMT framework in which the interaction is decomposed into two simple human actions: picking a critical translation error (Pick) and revising the translation (Revise). The picked phrase can be at any position in the sentence, which improves the efficiency of human-computer interaction. We also propose automatic suggestion models for the two actions to further reduce the cost of human interaction. Experimental results demonstrate that interaction through either action alone significantly improves translation quality, and greater gains are achieved by iteratively performing both.
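
The Pick-Revise loop can be sketched at word level: pick one critical error anywhere in the sentence, substitute a correction, and repeat. This is a deliberately naive stand-in, assuming word-level errors and using the reference as the correction source; the paper works with phrases and learned suggestion models.

```python
def pick(hyp, ref):
    """Pick: flag the first hypothesis word that never occurs in the
    reference (a toy stand-in for the paper's suggestion model)."""
    for i, word in enumerate(hyp):
        if word not in ref:
            return i
    return -1

def pick_revise(hyp, ref, max_rounds=10):
    """Iterate Pick-Revise: each round corrects one error, at any
    position in the sentence, until no critical error remains."""
    hyp = list(hyp)
    for _ in range(max_rounds):
        i = pick(hyp, ref)
        if i < 0:
            break
        hyp[i] = ref[min(i, len(ref) - 1)]  # Revise: substitute a correction
    return hyp
```

Unlike left-to-right IMT, nothing here forces the first interaction to touch the sentence prefix; the picked position is wherever the error is.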


International Conference on Asian Language Processing | 2009

Segmenting Long Sentence Pairs for Statistical Machine Translation

Biping Meng; Shujian Huang; Xinyu Dai; Jiajun Chen

In phrase-based statistical machine translation, knowledge about phrase translation and phrase reordering is learned from bilingual corpora. In practice, however, words may be poorly aligned in long sentence pairs, which harms subsequent steps of the translation pipeline, such as phrase extraction. A possible solution is to segment long sentence pairs into shorter ones. In this paper, we present an effective approach to segmenting sentence pairs based on a modified IBM Translation Model 1. We find that taking into account the semantics of some words, as well as the length ratio of the source and target sentences, largely improves the segmentation result. We also discuss the effect of the length factor on the segmentation result. Experiments show that our approach can improve the BLEU score of a phrase-based translation system by about 0.5 points.
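
A Model 1-based split score like the one described can be sketched as follows: each candidate split point is scored by the Model 1 likelihood of the two sub-pairs, penalized when their length ratios diverge. This is an illustrative reading with an invented penalty form, not the paper's exact modification.

```python
import math

def model1_logprob(src, tgt, t, floor=1e-6):
    """IBM Model 1 log-probability of tgt given src, with uniform
    alignment probabilities and a floor for unseen word pairs."""
    lp = 0.0
    for f in tgt:
        s = sum(t.get((e, f), floor) for e in src) / len(src)
        lp += math.log(max(s, floor))
    return lp

def best_split(src, tgt, t, ratio_weight=1.0):
    """Score every (i, j) split by the sum of the two sub-pairs'
    Model 1 scores minus a length-ratio penalty; return the best."""
    best, best_score = None, -float("inf")
    for i in range(1, len(src)):
        for j in range(1, len(tgt)):
            penalty = abs(math.log(i / j)) + abs(math.log((len(src) - i) / (len(tgt) - j)))
            score = (model1_logprob(src[:i], tgt[:j], t)
                     + model1_logprob(src[i:], tgt[j:], t)
                     - ratio_weight * penalty)
            if score > best_score:
                best, best_score = (i, j), score
    return best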


CCL | 2014

An Investigation on Statistical Machine Translation with Neural Language Models

Yinggong Zhao; Shujian Huang; Huadong Chen; Jiajun Chen

Recent work has shown the effectiveness of neural probabilistic language models (NPLMs) in statistical machine translation (SMT), both through reranking the n-best outputs and through direct decoding. However, some issues remain in applying NPLMs. In this paper we investigate further through detailed experiments and extensions of state-of-the-art NPLMs. Our experiments on large-scale datasets show that our final setting, i.e., decoding with conventional n-gram LMs plus un-normalized feedforward NPLMs extended with word clusters, significantly improves translation performance by up to 1.1 BLEU on average across four test datasets, while keeping decoding time acceptable. The results also show that current NPLMs, both feedforward and RNN, still cannot simply replace n-gram LMs for SMT.
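
The n-best reranking setting mentioned above is mechanically simple: each hypothesis carries several log-scores, and adding an NPLM score with a nonzero weight can reorder the list. The sketch below is generic, with invented feature names and score values.

```python
def rerank(nbest, weights):
    """Rescore each hypothesis with a weighted sum of its feature
    log-scores (e.g. translation model, n-gram LM, NPLM) and sort."""
    def total(hyp):
        return sum(weights[name] * score for name, score in hyp["scores"].items())
    return sorted(nbest, key=total, reverse=True)

nbest = [
    {"text": "hyp A", "scores": {"tm": -2.0, "ngram_lm": -3.0, "nplm": -1.0}},
    {"text": "hyp B", "scores": {"tm": -2.5, "ngram_lm": -2.3, "nplm": -4.0}},
]
baseline = {"tm": 1.0, "ngram_lm": 1.0, "nplm": 0.0}   # NPLM switched off
with_nplm = {"tm": 1.0, "ngram_lm": 1.0, "nplm": 1.0}  # NPLM as an extra feature
```

Because the NPLM enters only as one more feature alongside the n-gram LM, this also reflects the paper's conclusion: the NPLM complements rather than replaces the n-gram LM.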


International Joint Conference on Natural Language Processing | 2015

Non-linear Learning for Statistical Machine Translation

Shujian Huang; Huadong Chen; Xinyu Dai; Jiajun Chen

Modern statistical machine translation (SMT) systems usually use a linear combination of features to model the quality of each translation hypothesis. The linear combination assumes that all the features are in a linear relationship and constrains each feature to interact with the rest linearly, which may limit the expressive power of the model and lead to an under-fit model on the current data. In this paper, we propose a non-linear model of translation hypothesis quality based on neural networks, which allows more complex interactions between features. We present a learning framework for training the non-linear models and discuss heuristics for designing the network structure that may improve non-linear learning performance. Experimental results show that, with the basic features of a hierarchical phrase-based machine translation system, our method produces translations better than those of a linear model.
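
The limitation of the linear combination can be shown with a minimal example: a single hidden tanh layer lets features interact before they are combined, so it can express XOR-shaped quality patterns that no dot product can. The weights below are hand-picked for illustration, not learned as in the paper.

```python
import math

def linear_score(features, w):
    """Standard SMT scoring: a dot product of feature values and weights."""
    return sum(wi * fi for wi, fi in zip(w, features))

def nonlinear_score(features, W1, b1, w2):
    """One hidden tanh layer lets features interact non-linearly
    before being combined into a single model score."""
    hidden = [math.tanh(sum(wij * fi for wij, fi in zip(row, features)) + bj)
              for row, bj in zip(W1, b1)]
    return sum(w2j * hj for w2j, hj in zip(w2, hidden))

# Hand-picked weights realizing an XOR-like pattern: hypotheses where
# exactly one of two features fires score higher than where both or neither do.
W1, b1, w2 = [[1.0, 1.0], [1.0, 1.0]], [-0.5, -1.5], [1.0, -1.0]
good = nonlinear_score([1.0, 0.0], W1, b1, w2)
bad_both = nonlinear_score([1.0, 1.0], W1, b1, w2)
bad_neither = nonlinear_score([0.0, 0.0], W1, b1, w2)
```

A linear score over the same two features cannot rank (1,0) and (0,1) above both (0,0) and (1,1), which is the expressiveness gap the paper targets.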

Collaboration


Dive into Shujian Huang's collaborations.

Top Co-Authors

Junsheng Zhou

Nanjing Normal University
