Meishan Zhang
Harbin Institute of Technology
Publication
Featured research published by Meishan Zhang.
empirical methods in natural language processing | 2015
Meishan Zhang; Yue Zhang; Duy Tin Vo
Open domain targeted sentiment is the joint information extraction task that finds target mentions together with the sentiment towards each mention from a text corpus. The task is typically modeled as a sequence labeling problem, and solved using state-of-the-art labelers such as CRF. We empirically study the effect of word embeddings and automatic feature combinations on the task by extending a CRF baseline using neural networks, which have demonstrated great potential for sentiment analysis. Results show that the neural model can give better results by significantly increasing the recall. In addition, we propose a novel integration of neural and discrete features, which combines their relative advantages, leading to significantly higher results compared to both baselines.
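As a rough illustration of the sequence-labeling formulation described in this abstract, the sketch below decodes a tag sequence into (target mention, sentiment) pairs. The collapsed tag names (B-POS, I-POS, B-NEG, O) and the function name decode_targets are assumptions made for illustration; they are not taken from the paper.

```python
def decode_targets(tokens, tags):
    """Collect (target mention, sentiment) pairs from collapsed BIO tags.

    Tags are assumed to look like "B-POS", "I-POS", "B-NEG" or "O"
    (a hypothetical tag set, chosen only for this example).
    """
    spans = []
    current, polarity = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new mention starts; close any mention that is still open.
            if current:
                spans.append((" ".join(current), polarity))
            current, polarity = [token], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == polarity:
            # Continue the open mention with the same polarity.
            current.append(token)
        else:
            # "O" or an inconsistent "I-" tag: close the open mention, if any.
            if current:
                spans.append((" ".join(current), polarity))
            current, polarity = [], None
    if current:
        spans.append((" ".join(current), polarity))
    return spans
```

For example, the tokens "I love New York but hate traffic" tagged O O B-POS I-POS O O B-NEG would yield the pairs ("New York", "POS") and ("traffic", "NEG").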
meeting of the association for computational linguistics | 2014
Meishan Zhang; Yue Zhang; Wanxiang Che; Ting Liu
Recent work on Chinese analysis has led to large-scale annotations of the internal structures of words, enabling character-level analysis of Chinese syntactic structures. In this paper, we investigate the problem of character-level Chinese dependency parsing, building dependency trees over characters. Character-level information can benefit downstream applications by offering flexible granularities for word segmentation while improving word-level dependency parsing accuracies. We present novel adaptations of two major shift-reduce dependency parsing algorithms to character-level parsing. Experimental results on the Chinese Treebank demonstrate improved performance over word-based parsing methods.
meeting of the association for computational linguistics | 2016
Meishan Zhang; Yue Zhang; Guohong Fu
Character-based and word-based methods are the two main types of statistical models for Chinese word segmentation, the former exploiting sequence labeling models over characters and the latter typically exploiting a transition-based model, with the advantage that word-level features can be easily utilized. Neural models have been exploited for character-based Chinese word segmentation, giving high accuracies by making use of external character embeddings while requiring less feature engineering. In this paper, we study a neural model for word-based Chinese word segmentation, replacing the manually designed discrete features with neural features in a word-based segmentation framework. Experimental results demonstrate that word features lead to performance comparable to the best systems in the literature, and a further combination of discrete and neural features gives top accuracies.
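To make the transition-based, word-based view of segmentation more concrete, here is a minimal sketch in which each character is consumed by one of two actions: SEP starts a new word and APP appends the character to the last word. The action names and the function apply_actions are illustrative assumptions; the sketch only shows how an action sequence maps to a segmentation, not how the model in the paper scores actions.

```python
def apply_actions(chars, actions):
    """Turn a character sequence plus SEP/APP actions into a list of words."""
    if len(chars) != len(actions):
        raise ValueError("need exactly one action per character")
    words = []
    for ch, act in zip(chars, actions):
        if act not in ("SEP", "APP"):
            raise ValueError(f"unknown action: {act}")
        if act == "SEP" or not words:
            # Start a new word (the first character always starts one).
            words.append(ch)
        else:
            # Extend the most recent word with this character.
            words[-1] += ch
    return words
```

With the characters of 我喜欢北京 and the actions SEP, SEP, APP, SEP, APP, the result is the three words 我, 喜欢, 北京. Because each action sees the partially built words, word-level features are readily available at every step.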
empirical methods in natural language processing | 2015
Meishan Zhang; Yue Zhang
We investigate a combination of a traditional linear sparse feature model and a multi-layer neural network model for deterministic transition-based dependency parsing, by integrating the sparse features into the neural model. Correlations are drawn between the hybrid model and previous work on integrating word embedding features into a discrete linear model. By analyzing the results of various parsers on web-domain parsing, we show that the integrated model is a better way to combine traditional and embedding features compared with previous methods.
international joint conference on natural language processing | 2015
Rui Sun; Yue Zhang; Meishan Zhang; Donghong Ji
We propose an event-driven model for headline generation. Given an input document, the system identifies a key event chain by extracting a set of structural events that describe them. Then a novel multi-sentence compression algorithm is used to fuse the extracted events, generating a headline for the document. Our model can be viewed as a novel combination of extractive and abstractive headline generation, combining the advantages of both methods using event structures. Standard evaluation shows that our model achieves the best performance compared with previous state-of-the-art systems.
empirical methods in natural language processing | 2015
Tao Qian; Yue Zhang; Meishan Zhang; Yafeng Ren; Donghong Ji
We propose a transition-based model for joint word segmentation, POS tagging and text normalization. Unlike previous methods, the model can be trained on standard text corpora, overcoming the lack of annotated microblog corpora. To evaluate our model, we develop an annotated corpus based on microblogs. Experimental results show that our joint model can help improve the performance of word segmentation on microblogs, giving an error reduction in segmentation accuracy of 12.02%, compared to the traditional approach.
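The error-reduction figure above is presumably a relative error reduction. The sketch below assumes the conventional definition (the drop in error rate divided by the baseline error rate); the abstract does not spell out the formula, and the function name and the accuracies in the usage note are made up for illustration.

```python
def relative_error_reduction(baseline_acc, new_acc):
    """Relative error reduction between two accuracies given in [0, 1].

    Assumes the conventional definition:
    (baseline_error - new_error) / baseline_error.
    """
    baseline_err = 1.0 - baseline_acc
    new_err = 1.0 - new_acc
    if baseline_err <= 0.0:
        raise ValueError("baseline has no errors to reduce")
    return (baseline_err - new_err) / baseline_err
```

Under this definition, moving from a hypothetical 90.0% accuracy to 91.2% removes 1.2 of the 10 error points, a 12% error reduction.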
conference of the european chapter of the association for computational linguistics | 2014
Meishan Zhang; Yue Zhang; Wanxiang Che; Ting Liu
We report an empirical investigation on type-supervised domain adaptation for joint Chinese word segmentation and POS-tagging, making use of domain-specific tag dictionaries and only unlabeled target domain data to improve target-domain accuracies, given a set of annotated source domain sentences. Previous work on POS-tagging of other languages showed that type-supervision can be a competitive alternative to token-supervision, while semi-supervised techniques such as label propagation are important to the effectiveness of type-supervision. We report similar findings using a novel approach for joint Chinese segmentation and POS-tagging, under a cross-domain setting. With the help of unlabeled sentences and a lexicon of 3,000 words, we obtain 33% error reduction in target-domain tagging. In addition, combined type- and token-supervision can lead to improved cost-effectiveness.
international conference on computational linguistics | 2014
Meishan Zhang; Yue Zhang; Wanxiang Che; Ting Liu
Chinese grammar engineering has been a much debated task. Whilst semantic information has been reckoned crucial for Chinese syntactic analysis and downstream applications, existing Chinese treebanks lack a consistent and strict sentential semantic formalism. In this paper, we introduce a semantics-oriented grammar for Chinese, designed to provide basic support for tasks such as automatic semantic parsing and sentence generation. It has a directed acyclic graph structure with a simple yet expressive label set, and leverages elementary predication to support logical form conversion. To our knowledge, it is the first Chinese grammar representation capable of direct transformation into logical forms.
meeting of the association for computational linguistics | 2014
Yue Zhang; Meishan Zhang; Ting Liu
This tutorial discusses a framework for incremental left-to-right structured prediction, which makes use of global discriminative learning and beam-search decoding. The method has been applied to a wide range of NLP tasks in recent years, and has achieved competitive accuracy and efficiency. We give an introduction to the algorithms and efficient implementations, and discuss their applications to a range of NLP tasks.
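The decoding side of such a framework can be sketched generically as follows. This is a minimal beam search over action sequences, assuming a fixed number of steps and a caller-supplied scoring function; the names beam_search, score_fn and beam_size are illustrative, and the global discriminative learning component is not shown.

```python
import heapq


def beam_search(n_steps, actions, score_fn, beam_size=4):
    """Incremental left-to-right decoding with a fixed-size beam.

    score_fn(history, action, step) returns the score of taking `action`
    after the partial action sequence `history` (hypothetical interface).
    Returns the best (cumulative score, action sequence) pair found.
    """
    beam = [(0.0, [])]  # (cumulative score, action history)
    for step in range(n_steps):
        candidates = []
        for score, history in beam:
            for action in actions:
                new_score = score + score_fn(history, action, step)
                candidates.append((new_score, history + [action]))
        # Keep only the highest-scoring partial sequences.
        beam = heapq.nlargest(beam_size, candidates, key=lambda c: c[0])
    return max(beam, key=lambda c: c[0])
```

Because only beam_size hypotheses survive each step, the search is approximate: a sequence that scores poorly early on can be pruned even if it would have scored best overall.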
international conference on asian language processing | 2012
Meishan Zhang; Wanxiang Che; Yanqiu Shao; Ting Liu
We address the problem of Chinese semantic dependency parsing. Dependency parsing is traditionally oriented toward syntactic analysis, which we call syntactic dependency parsing to distinguish it from semantic dependency parsing. In this paper, we first compare Chinese semantic dependency parsing and syntactic dependency parsing systematically, showing that syntactic dependency parsing can potentially improve the performance of semantic dependency parsing. We then propose an approach based on quasi-synchronous grammar to incorporate the automatically parsed syntactic dependency tree into semantic dependency parsing. We conduct experiments on the Chinese semantic dependency parsing corpus of SemEval-2012 and achieve 65.25% LAS on the test corpus, an improvement of 2.45 points over the top SemEval-2012 result of 62.80%.
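For readers unfamiliar with the metric, LAS (labeled attachment score) is the fraction of tokens whose predicted head and dependency label both match the gold annotation. The sketch below computes it together with the unlabeled variant (UAS); the function name, the input format and the labels in the usage note are assumptions for illustration, and the official SemEval-2012 scorer may differ in details such as punctuation handling.

```python
def attachment_scores(gold, pred):
    """Return (UAS, LAS) for one or more sentences.

    gold and pred are equal-length lists of (head_index, label) pairs,
    one pair per token.
    """
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and equal in length")
    n = len(gold)
    # UAS: the head matches; LAS: both head and label match.
    uas = sum(g[0] == p[0] for g, p in zip(gold, pred)) / n
    las = sum(g == p for g, p in zip(gold, pred)) / n
    return uas, las
```

With four tokens, where one prediction has the wrong label and another the wrong head, the function returns a UAS of 0.75 and an LAS of 0.5.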