Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Lei Sha is active.

Publication


Featured research published by Lei Sha.


Meeting of the Association for Computational Linguistics | 2016

RBPB: Regularization-Based Pattern Balancing Method for Event Extraction

Lei Sha; Jing Liu; Chin-Yew Lin; Sujian Li; Baobao Chang; Zhifang Sui

Event extraction is a particularly challenging information extraction task that aims to identify and classify event triggers and arguments in raw text. In recent work on determining event types (trigger classification), most approaches are either pattern-only or feature-only. Although patterns cannot cover every representation of an event, they remain a very important feature. In addition, when identifying and classifying arguments, previous work considers each candidate argument separately, ignoring the relationships between arguments. This paper proposes a Regularization-Based Pattern Balancing method (RBPB). Inspired by progress in representation learning, we use trigger embeddings, sentence-level embeddings, and pattern features together as our features for trigger classification, so that the effect of patterns and other useful features can be balanced. In addition, RBPB uses a regularization method to take advantage of the relationships between arguments. Experiments show that we achieve better results than current state-of-the-art equivalents.


Empirical Methods in Natural Language Processing | 2016

Encoding Temporal Information for Time-Aware Link Prediction

Tingsong Jiang; Tianyu Liu; Tao Ge; Lei Sha; Sujian Li; Baobao Chang; Zhifang Sui

Most existing knowledge base (KB) embedding methods learn solely from time-unknown fact triples and neglect the temporal information in the knowledge base. In this paper, we propose a novel time-aware KB embedding approach that takes advantage of the happening time of facts. Specifically, we use temporal order constraints to model transformations between time-sensitive relations and enforce the embeddings to be temporally consistent and more accurate. We empirically evaluate our approach on two tasks, link prediction and triple classification. Experimental results show that our method consistently outperforms the baselines on both tasks.
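The temporal order constraint described above can be illustrated with a minimal sketch (not the paper's actual code; the transformation matrix `M` and the margin formulation are assumed details): a matrix `M` should map the embedding of the earlier relation close to the embedding of the later one, and the correctly ordered pair should score better than the reversed pair by a margin.

```python
import numpy as np

def temporal_order_loss(r_prior, r_next, M, margin=1.0):
    # score(pair) = ||M @ r_prior - r_next||_1 : a temporally ordered
    # relation pair should score lower (better) than the reversed,
    # corrupted pair by at least `margin` (hinge loss).
    pos = np.abs(M @ r_prior - r_next).sum()
    neg = np.abs(M @ r_next - r_prior).sum()
    return max(0.0, margin + pos - neg)
```

Training would sum this penalty over all observed temporally ordered relation pairs alongside the usual triple-scoring loss, pushing embeddings toward temporal consistency.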


Empirical Methods in Natural Language Processing | 2015

Recognizing Textual Entailment Using Probabilistic Inference

Lei Sha; Sujian Li; Baobao Chang; Zhifang Sui; Tingsong Jiang

Recognizing Textual Entailment (RTE) plays an important role in NLP applications including question answering, information retrieval, etc. Recent work explores "deep" expressions such as discourse commitments or strict logic for representing the text. However, these expressions suffer from inconvenient inference or translation loss. To overcome these limitations, in this paper we propose to use predicate-argument structures to represent the discourse commitments extracted from the text. At the same time, with the help of the YAGO knowledge base, we borrow the distant supervision technique to mine implicit facts from the text. We also construct a probabilistic network over all the facts and conduct inference to judge the confidence of each fact for RTE. The experimental results show that our proposed method achieves a competitive result compared to previous work.


Empirical Methods in Natural Language Processing | 2016

Capturing Argument Relationship for Chinese Semantic Role Labeling

Lei Sha; Sujian Li; Baobao Chang; Zhifang Sui; Tingsong Jiang

In this paper, we capture argument relationships for the Chinese semantic role labeling (SRL) task and use them to improve the task's performance. We split the relationship between two candidate arguments into two categories: (1) compatible arguments: if one candidate argument belongs to a given predicate, the other is more likely to belong to the same predicate; and (2) incompatible arguments: if one candidate argument belongs to a given predicate, the other is less likely to belong to the same predicate. Previous work did not explicitly model argument relationships. We use a simple maximum entropy classifier to capture the two categories of argument relationships and test its performance on the Chinese Proposition Bank (CPB). The experiments show that argument relationships are effective for the Chinese semantic role labeling task.
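As a hedged sketch of the classifier component (not the authors' exact setup), a maximum entropy model over argument pairs reduces to binary logistic regression on hand-designed pair features; the feature encoding below is hypothetical.

```python
import numpy as np

def train_maxent(X, y, lr=0.5, epochs=300):
    """Binary maximum-entropy (logistic regression) classifier:
    p(compatible | pair features x) = sigmoid(w . x + b)."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted probabilities
        grad = p - y                             # gradient of the log-loss
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b

# Hypothetical pair features: [shares_syntactic_head, long_token_distance]
X = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 0.0], [0.0, 1.0]])
y = np.array([1, 1, 0, 0])  # 1 = compatible pair, 0 = incompatible pair
w, b = train_maxent(X, y)
pred = (X @ w + b > 0).astype(int)
```

The predicted compatibility scores would then act as pairwise constraints during argument labeling, e.g. inside the regularization term over candidate arguments.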


China National Conference on Chinese Computational Linguistics | 2016

Recognizing Textual Entailment via Multi-task Knowledge Assisted LSTM

Lei Sha; Sujian Li; Baobao Chang; Zhifang Sui

Recognizing Textual Entailment (RTE) plays an important role in NLP applications like question answering, information retrieval, etc. Most previous work either uses classifiers with elaborately designed features and lexical similarity or brings distant supervision and reasoning techniques into the RTE task. However, these approaches are hard to generalize due to the complexity of feature engineering and are prone to cascading errors and data sparsity problems. To alleviate these problems, some work uses LSTM-based recurrent neural networks with word-by-word attention to recognize textual entailment. Nevertheless, this work does not make full use of a knowledge base (KB) to help with reasoning. In this paper, we propose a deep neural network architecture called Multi-task Knowledge Assisted LSTM (MKAL), which aims to conduct implicit inference with the assistance of a KB and uses predicate-to-predicate attention to detect entailment between predicates. In addition, our model applies a multi-task architecture to further improve performance. The experimental results show that our proposed method achieves a competitive result compared to previous work.


NLPCC | 2014

Event Schema Induction Based on Relational Co-occurrence over Multiple Documents

Tingsong Jiang; Lei Sha; Zhifang Sui

An event schema, which comprises a set of related events and participants, is of great importance with the development of information extraction (IE), and inducing event schemas is a prerequisite for IE and natural language generation. Event schemas and slots are usually designed manually for traditional IE tasks; methods for inducing event schemas automatically have been proposed recently. One of the fundamental assumptions in event schema induction is that related events tend to appear together to describe a scenario in natural-language discourse, yet previous work only focused on co-occurrence within one document. We find that the co-occurrence of semantically typed relational tuples across multiple documents is helpful for constructing event schemas. We exploit relational tuple co-occurrence over multiple documents by locating the key tuple and counting relational tuples, and we build a co-occurrence graph that takes account of co-occurrence information across multiple documents. Experiments show that co-occurrence information over multiple documents helps to combine similar elements of an event schema as well as to alleviate incoherence problems.
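The multi-document co-occurrence counting can be sketched minimally as follows (the tuple format and function name are assumptions for illustration, not the paper's code):

```python
from collections import Counter
from itertools import combinations

def build_cooccurrence_graph(docs):
    """docs: a list of documents, each a collection of semantically typed
    relational tuples, e.g. ("bomb", "explode-in", "building").
    Returns a Counter mapping each unordered tuple pair to the number of
    documents in which both tuples appear together (the edge weight)."""
    edges = Counter()
    for doc in docs:
        # sort so each unordered pair gets one canonical key
        for a, b in combinations(sorted(set(doc)), 2):
            edges[(a, b)] += 1
    return edges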


Meeting of the Association for Computational Linguistics | 2017

A Progressive Learning Approach to Chinese SRL Using Heterogeneous Data

Qiaolin Xia; Lei Sha; Baobao Chang; Zhifang Sui

Previous studies on Chinese semantic role labeling (SRL) have concentrated on a single semantically annotated corpus, but the training data of a single corpus is often limited, while the other existing semantically annotated corpora for Chinese SRL are scattered across different annotation frameworks. Data sparsity thus remains a bottleneck. This situation calls for larger training datasets, or for effective approaches that can take advantage of highly heterogeneous data. In this paper, we focus mainly on the latter, that is, improving Chinese SRL by using heterogeneous corpora together. We propose a novel progressive learning model that augments the Progressive Neural Network with Gated Recurrent Adapters. The model can accommodate heterogeneous inputs and effectively transfer knowledge between them. We also release a new corpus, Chinese SemBank, for Chinese SRL. Experiments on CPB 1.0 show that our model outperforms state-of-the-art methods.


International Conference on Computational Linguistics | 2016

Reading and Thinking: Re-read LSTM Unit for Textual Entailment Recognition

Lei Sha; Baobao Chang; Zhifang Sui; Sujian Li


North American Chapter of the Association for Computational Linguistics | 2016

Joint Learning Templates and Slots for Event Schema Induction

Lei Sha; Sujian Li; Baobao Chang; Zhifang Sui


National Conference on Artificial Intelligence | 2018

Table-to-text Generation by Structure-aware Seq2seq Learning

Tianyu Liu; Kexiang Wang; Lei Sha; Zhifang Sui; Baobao Chang

Collaboration


Dive into Lei Sha's collaborations.

Top Co-Authors

Jing Liu

Harbin Institute of Technology
