Tao Ge
Peking University
Publication
Featured research published by Tao Ge.
meeting of the association for computational linguistics | 2014
Wenzhe Pei; Tao Ge; Baobao Chang
Recently, neural network models for natural language processing tasks have received increasing attention for their ability to alleviate the burden of manual feature engineering. In this paper, we propose a novel neural network model for Chinese word segmentation called Max-Margin Tensor Neural Network (MMTNN). By exploiting tag embeddings and tensor-based transformation, MMTNN has the ability to model complicated interactions between tags and context characters. Furthermore, a new tensor factorization approach is proposed to speed up the model and avoid overfitting. Experiments on the benchmark dataset show that our model achieves better performance than previous neural network models and that it can achieve competitive performance with minimal feature engineering. Although Chinese word segmentation is a specific case, MMTNN can be easily generalized and applied to other sequence labeling tasks.
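The abstract does not spell out the tensor-based transformation or its factorization, so the following is only a minimal sketch under stated assumptions: each output unit is taken to compute x^T V[i] x + (W x)[i] + b[i], and each d x d slice V[i] is replaced by the product of two low-rank factors P[i] (d x r) and Q[i] (r x d). All names (tensor_layer_full, tensor_layer_factored, P, Q, W, b) and the dimensions are hypothetical and chosen for illustration.

```python
import random

def matvec(m, v):
    """Multiply matrix m (a list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def matmul(a, b):
    """Multiply matrix a (n x r) by matrix b (r x m)."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def tensor_layer_full(x, V, W, b):
    """Assumed form: z[i] = x^T V[i] x + (W x)[i] + b[i], one full d x d slice per unit."""
    Wx = matvec(W, x)
    return [
        sum(xj * vj for xj, vj in zip(x, matvec(V[i], x))) + Wx[i] + b[i]
        for i in range(len(V))
    ]

def tensor_layer_factored(x, P, Q, W, b):
    """Same layer with V[i] replaced by P[i] @ Q[i]; the d x d slice is never built."""
    Wx = matvec(W, x)
    out = []
    for i in range(len(P)):
        xP = matvec(list(zip(*P[i])), x)  # x^T P[i], a vector of length r
        Qx = matvec(Q[i], x)              # Q[i] x, a vector of length r
        out.append(sum(p * q for p, q in zip(xP, Qx)) + Wx[i] + b[i])
    return out

# Illustrative sizes only: input dimension d, rank r, number of output units h.
random.seed(0)
d, r, h = 6, 2, 3

def rand_matrix(n, m):
    return [[random.uniform(-1, 1) for _ in range(m)] for _ in range(n)]

P = [rand_matrix(d, r) for _ in range(h)]
Q = [rand_matrix(r, d) for _ in range(h)]
W = rand_matrix(h, d)
b = [0.0] * h
x = [random.uniform(-1, 1) for _ in range(d)]

V = [matmul(P[i], Q[i]) for i in range(h)]  # explicit slices, for comparison only
full = tensor_layer_full(x, V, W, b)
fact = tensor_layer_factored(x, P, Q, W, b)
```

Under these assumptions the factored layer stores 2·d·r values per slice instead of d·d, which is where the speed-up and the regularizing effect mentioned in the abstract would come from when r is much smaller than d.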
international joint conference on natural language processing | 2015
Wenzhe Pei; Tao Ge; Baobao Chang
Most existing graph-based parsing models rely on millions of hand-crafted features, which limits their generalization ability and slows down the parsing speed. In this paper, we propose a general and effective Neural Network model for graph-based dependency parsing. Our model can automatically learn high-order feature combinations using only atomic features by exploiting a novel activation function tanhcube. Moreover, we propose a simple yet effective way to utilize phrase-level information that is expensive to use in conventional graph-based parsers. Experiments on the English Penn Treebank show that parsers based on our model perform better than conventional graph-based parsers.
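The abstract names the tanhcube activation without defining it. As an illustration only, the sketch below assumes the form tanh(x^3 + x); that formula is an assumption here, not something stated in the text above. The cubic term is what would let a single hidden unit respond to products of up to three input features, since (w1·x1 + w2·x2 + w3·x3)^3 expands into such cross terms.

```python
import math

def tanh_cube(x: float) -> float:
    """Assumed form of the tanhcube activation: tanh(x**3 + x)."""
    return math.tanh(x ** 3 + x)

def hidden_unit(features, weights, bias=0.0):
    """One hidden unit: a weighted sum of atomic features passed through tanh_cube."""
    pre_activation = sum(w * f for w, f in zip(weights, features)) + bias
    return tanh_cube(pre_activation)

# Hypothetical atomic features and weights, for illustration only.
activation = hidden_unit([1.0, 0.0, 1.0], [0.4, -0.2, 0.3])
```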
international joint conference on natural language processing | 2015
Tao Ge; Wenzhe Pei; Heng Ji; Sujian Li; Baobao Chang; Zhifang Sui
An event chronicle provides people with easy and fast access to the past. In this paper, we propose the first approach to automatically generate a topically relevant event chronicle during a certain period given a reference chronicle during another period. Our approach consists of two core components – a time-aware hierarchical Bayesian model for event detection, and a learning-to-rank model to select the salient events to construct the final chronicle. Experimental results demonstrate that our approach is promising for tackling this new problem.
empirical methods in natural language processing | 2016
Tao Ge; Lei Cui; Baobao Chang; Sujian Li; Ming Zhou; Zhifang Sui
This paper studies summarizing key information from news streams. We propose simple yet effective models to solve the problem based on a novel and promising representation of text streams – Burst Information Networks (BINets). A BINet can be aware of redundant information, allows global analysis of a text stream, and can be efficiently built and dynamically updated, which perfectly fits the demands of text stream summarization. Extensive experiments show that the BINet-based approaches are not only efficient and can be used in a real-time online summarization setting, but also can generate high-quality summaries, outperforming the state-of-the-art approach.
empirical methods in natural language processing | 2016
Tingsong Jiang; Tianyu Liu; Tao Ge; Lei Sha; Sujian Li; Baobao Chang; Zhifang Sui
Most existing knowledge base (KB) embedding methods solely learn from time-unknown fact triples but neglect the temporal information in the knowledge base. In this paper, we propose a novel time-aware KB embedding approach taking advantage of the happening time of facts. Specifically, we use temporal order constraints to model transformation between time-sensitive relations and enforce the embeddings to be temporally consistent and more accurate. We empirically evaluate our approach in two tasks of link prediction and triple classification. Experimental results show that our method outperforms other baselines on the two tasks consistently.
conference on information and knowledge management | 2013
Tao Ge; Zhifang Sui; Baobao Chang
The automatic assessment of students' free-text responses is a relatively new task in both computational linguistics and educational technology. The goal of the task is to produce an assessment of student answers to explanation and definition questions typically asked in problems seen in practice exercises or tests. Unlike some conventional methods, which assess the student responses based only on information about their corresponding questions, this paper exploits the idea of collaborative filtering to analyze student responses and uses an effective collaborative filtering model -- a feature-based matrix factorization model -- to deal with this challenge. The experimental results show that our feature-based matrix factorization model outperforms the baseline models and that the model with a re-ranking phase can achieve better, competitive performance -- 63.6% overall accuracy on the Beetle dataset.
NLPCC/ICCPOL | 2016
Tao Ge; Lei Cui; Heng Ji; Baobao Chang; Zhifang Sui
We study an open text mining problem – discovering concept-level event associations from a text stream. We investigate the importance and challenge of this task and propose a novel solution by using event sequential patterns. The proposed approach can discover important event associations implicitly expressed. The discovered event associations are general and useful as knowledge for applications such as event prediction.
asia-pacific web conference | 2015
Tao Ge; Wenzhe Pei; Baobao Chang; Zhifang Sui
The task of distinguishing specific and daily topics is useful in many applications such as event chronicle and timeline generation, and cross-document event coreference resolution. In this paper, we investigate several numeric features that describe useful statistical information for this task, and propose a novel Bayesian model for distinguishing specific and daily topics from a collection of documents based on the documents’ content. The proposed Bayesian model exploits a mixture of Poisson distributions for modeling the probability distributions of the numeric features. The experimental results show that our approach is promising for solving this problem.
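To make the mixture-of-Poissons idea concrete, here is a minimal sketch of how a count-valued feature could be scored under a two-component Poisson mixture and assigned a posterior probability for each component. It is a generic illustration, not the paper's model: the component weights, the rates, and the reading of the components as "daily" versus "specific" topics are all assumptions made up for the example.

```python
import math

def poisson_log_pmf(k: int, lam: float) -> float:
    """Log-probability of observing count k under Poisson(lam)."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)

def component_log_joint(k, weights, rates):
    """log(weight_c) + log Poisson(k | rate_c) for every component c."""
    return [math.log(w) + poisson_log_pmf(k, lam) for w, lam in zip(weights, rates)]

def mixture_log_likelihood(k, weights, rates):
    """Log-likelihood of k under the mixture, computed with the log-sum-exp trick."""
    logs = component_log_joint(k, weights, rates)
    peak = max(logs)
    return peak + math.log(sum(math.exp(v - peak) for v in logs))

def responsibilities(k, weights, rates):
    """Posterior probability of each component given the observed count k."""
    logs = component_log_joint(k, weights, rates)
    total = mixture_log_likelihood(k, weights, rates)
    return [math.exp(v - total) for v in logs]

# Hypothetical parameters: component 0 has a low rate, component 1 a high rate.
weights = [0.7, 0.3]
rates = [2.0, 15.0]
posterior_low_count = responsibilities(1, weights, rates)
posterior_high_count = responsibilities(20, weights, rates)
```

In a full model the weights and rates would be learned from data (for example by EM or Bayesian inference) rather than fixed by hand as they are here.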
international conference natural language processing | 2018
Tao Ge; Lei Cui; Baobao Chang; Zhifang Sui; Furu Wei; Ming Zhou
Mining sub-event relations of major events is an important research problem, which is useful for building event taxonomy, event knowledge base construction, and natural language understanding. To advance the study of this problem, this paper presents a novel dataset called SeRI (Sub-event Relation Inference). SeRI includes 3,917 event articles from English Wikipedia and the annotations of their sub-events. It can be used for training or evaluating a model that mines sub-event relations from encyclopedia-style texts. Based on this dataset, we formally define the task of sub-event relation inference from an encyclopedia, propose an experimental setting and evaluation metrics, and evaluate the performance of some baseline approaches on this dataset.
international joint conference on natural language processing | 2015
Tao Ge; Heng Ji; Baobao Chang; Zhifang Sui
We study the problem of predicting tense in Chinese conversations. The unique challenges include: (1) Chinese verbs do not have explicit lexical or grammatical forms to indicate tense; (2) Tense information is often implicitly hidden outside of the target sentence. To tackle these challenges, we first propose a set of novel sentence-level (local) features using rich linguistic resources and then propose a new hypothesis of “One tense per scene” to incorporate scene-level (global) evidence to enhance the performance. Experimental results demonstrate the power of this hybrid approach, which can serve as a new and promising benchmark.