Ziqiang Cao
Peking University
Publications
Featured research published by Ziqiang Cao.
international joint conference on natural language processing | 2015
Ziqiang Cao; Furu Wei; Sujian Li; Wenjie Li; Ming Zhou; Houfeng Wang
In this paper, we propose the concept of a summary prior to define how appropriate a sentence is for inclusion in a summary, independent of its context. Different from previous work using manually compiled document-independent features, we develop a novel summarization system called PriorSum, which applies enhanced convolutional neural networks to capture summary prior features derived from length-variable phrases. Under a regression framework, the learned prior features are concatenated with document-dependent features for sentence ranking. Experiments on the DUC generic summarization benchmarks show that PriorSum can discover different aspects supporting the summary prior and outperforms state-of-the-art baselines.
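The ranking step described in the abstract can be sketched as follows. The feature values, dimensions, and weights below are illustrative stand-ins: in the paper the prior features come from a learned convolutional network and the weights are trained, neither of which is reproduced here.

```python
import numpy as np

# Hypothetical sketch of PriorSum-style sentence ranking: summary-prior
# features (random stand-ins for the CNN-derived features) are concatenated
# with document-dependent features, and a linear regression model assigns
# each sentence a saliency score. All dimensions are illustrative.

rng = np.random.default_rng(0)

n_sentences, prior_dim, doc_dim = 5, 8, 3
prior_feats = rng.normal(size=(n_sentences, prior_dim))  # stand-in for CNN output
doc_feats = rng.normal(size=(n_sentences, doc_dim))      # e.g. position, length

X = np.hstack([prior_feats, doc_feats])    # concatenation, as described above
w = rng.normal(size=prior_dim + doc_dim)   # regression weights (would be learned)

scores = X @ w                             # one saliency score per sentence
ranking = np.argsort(-scores)              # highest-scoring sentences first
print(ranking.tolist())
```

Top-ranked sentences would then be selected into the summary subject to a length budget.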
meeting of the association for computational linguistics | 2014
Sujian Li; Liang Wang; Ziqiang Cao; Wenjie Li
Previous research on text-level discourse parsing mainly made use of constituency structure to parse the whole document into one discourse tree. In this paper, we present the limitations of constituency-based discourse parsing and propose, for the first time, to use dependency structure to directly represent the relations between elementary discourse units (EDUs). State-of-the-art dependency parsing techniques, the Eisner algorithm and the maximum spanning tree (MST) algorithm, are adopted to parse an optimal discourse dependency tree based on the arc-factored model and large-margin learning techniques. Experiments show that our discourse dependency parsers achieve competitive performance on text-level discourse parsing.
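The arc-factored model mentioned above scores a tree as the sum of its arc scores and seeks the highest-scoring arborescence. A toy sketch under made-up arc scores is below; for a handful of EDUs exhaustive search suffices, whereas the paper uses the Eisner and MST (Chu-Liu/Edmonds) algorithms for efficiency.

```python
import itertools

# Toy arc-factored parsing: node 0 is an artificial root, nodes 1..3 are
# EDUs, and arc_score[(head, dependent)] holds illustrative (made-up) scores.
arc_score = {
    (0, 1): 5, (0, 2): 1, (0, 3): 1,
    (1, 2): 4, (1, 3): 2,
    (2, 1): 2, (2, 3): 6,
    (3, 1): 1, (3, 2): 1,
}

def is_tree(heads):
    """Every node must reach the root 0 by following head pointers (no cycles)."""
    for node in heads:
        seen, cur = set(), node
        while cur != 0:
            if cur in seen:
                return False  # cycle detected
            seen.add(cur)
            cur = heads[cur]
    return True

best_tree, best_score = None, float("-inf")
for hs in itertools.product([0, 1, 2, 3], repeat=3):  # candidate heads of 1..3
    heads = {d: h for d, h in zip([1, 2, 3], hs)}
    if any(h == d for d, h in heads.items()) or not is_tree(heads):
        continue
    score = sum(arc_score[(h, d)] for d, h in heads.items())
    if score > best_score:
        best_tree, best_score = heads, score

print(best_tree, best_score)  # highest-scoring dependency tree and its score
```

Because the score decomposes over arcs, dynamic programming (Eisner) or maximum-arborescence algorithms (MST) recover the optimum without enumeration.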
empirical methods in natural language processing | 2014
Ziqiang Cao; Sujian Li; Heng Ji
Previous work often used a pipelined framework in which Chinese word segmentation is followed by term extraction and keyword extraction. Such a framework suffers from error propagation and is unable to leverage information from later modules in earlier components. In this paper, we propose a four-level Dirichlet Process based model (DP-4) to jointly learn word distributions at the corpus, domain and document levels simultaneously. Based on the DP-4 model, a sentence-wise Gibbs sampler is adopted to obtain proper segmentation results. Meanwhile, terms and keywords are acquired during the sampling process. Experimental results show the effectiveness of our method.
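The building block that DP-4 stacks across the corpus, domain and document levels is the Dirichlet Process. A minimal sketch of a single DP draw via its Chinese-restaurant-process view is below; the concentration parameter and the number of customers are illustrative, not the paper's settings.

```python
import random

# Chinese restaurant process: customer n joins an existing table with
# probability proportional to its occupancy, or opens a new table with
# probability proportional to alpha. Table counts correspond to the
# atom weights of one Dirichlet Process draw.
random.seed(0)
alpha = 1.0
tables = []  # customer count per table

for customer in range(100):
    total = sum(tables) + alpha
    r = random.uniform(0, total)
    for i, count in enumerate(tables):
        if r < count:
            tables[i] += 1      # join existing table, prob. count / total
            break
        r -= count
    else:
        tables.append(1)        # open a new table, prob. alpha / total

print(len(tables), sum(tables))
```

In DP-4, the base measure of each level's DP is the draw from the level above, which is what ties corpus-, domain- and document-level word distributions together.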
national conference on artificial intelligence | 2015
Ziqiang Cao; Furu Wei; Li Dong; Sujian Li; Ming Zhou
national conference on artificial intelligence | 2015
Ziqiang Cao; Sujian Li; Yang Liu; Wenjie Li; Heng Ji
international conference on computational linguistics | 2016
Ziqiang Cao; Wenjie Li; Sujian Li; Furu Wei; Yanran Li
national conference on artificial intelligence | 2016
Ziqiang Cao; Chengyao Chen; Wenjie Li; Sujian Li; Furu Wei; Ming Zhou
national conference on artificial intelligence | 2016
Ziqiang Cao; Wenjie Li; Sujian Li; Furu Wei
national conference on artificial intelligence | 2018
Ziqiang Cao; Furu Wei; Wenjie Li; Sujian Li
national conference on artificial intelligence | 2017
Ziqiang Cao; Chuwei Luo; Wenjie Li; Sujian Li