Bishan Yang
Cornell University
Publication
Featured research published by Bishan Yang.
empirical methods in natural language processing | 2014
Kai-Wei Chang; Wen-tau Yih; Bishan Yang; Christopher Meek
While relation extraction has traditionally been viewed as a task relying solely on textual data, recent work has shown that by taking as input existing facts in the form of entity-relation triples from both knowledge bases and textual data, the performance of relation extraction can be improved significantly. Following this new paradigm, we propose a tensor decomposition approach for knowledge base embedding that is highly scalable and is especially suitable for relation extraction. By leveraging relational domain knowledge about entity type information, our learning algorithm is significantly faster than previous approaches and is better able to discover new relations missing from the database. In addition, when applied to a relation extraction task, our approach alone is comparable to several existing systems, and improves the weighted mean average precision of a state-of-the-art method by 10 points when used as a subcomponent.
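As a rough illustration of the kind of scoring a knowledge base embedding can produce, the sketch below scores an entity-relation triple with a bilinear form and only considers candidate pairs whose entity types match the relation. The bilinear form, the toy vectors, and every name used here (`bilinear_score`, `RELATION_TYPES`, and so on) are assumptions made for illustration; the abstract does not specify the model at this level of detail.

```python
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def bilinear_score(head: Vector, relation: Matrix, tail: Vector) -> float:
    """Return head^T * relation * tail for one candidate triple."""
    projected = [sum(row[j] * tail[j] for j in range(len(tail))) for row in relation]
    return sum(h * p for h, p in zip(head, projected))


# Toy embeddings and type annotations (hypothetical data, not from the paper).
ENTITY_VECS: Dict[str, Vector] = {
    "paris": [1.0, 0.0],
    "france": [0.0, 1.0],
    "einstein": [0.5, 0.5],
}
ENTITY_TYPES: Dict[str, str] = {
    "paris": "city",
    "france": "country",
    "einstein": "person",
}
RELATION_MATS: Dict[str, Matrix] = {"capital_of": [[0.0, 1.0], [0.0, 0.0]]}
RELATION_TYPES: Dict[str, Tuple[str, str]] = {"capital_of": ("city", "country")}


def typed_candidates(relation: str) -> List[Tuple[str, str]]:
    """Keep only (head, tail) pairs whose types fit the relation signature."""
    head_type, tail_type = RELATION_TYPES[relation]
    return [
        (h, t)
        for h in ENTITY_VECS
        for t in ENTITY_VECS
        if ENTITY_TYPES[h] == head_type and ENTITY_TYPES[t] == tail_type
    ]


def score_relation(relation: str) -> Dict[Tuple[str, str], float]:
    """Score every type-compatible pair for the given relation."""
    mat = RELATION_MATS[relation]
    return {
        (h, t): bilinear_score(ENTITY_VECS[h], mat, ENTITY_VECS[t])
        for h, t in typed_candidates(relation)
    }
```

Restricting candidates by type shrinks the set of pairs that must be scored, which is one plausible way type information can reduce computation.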
knowledge discovery and data mining | 2009
Bishan Yang; Jian-Tao Sun; Tengjiao Wang; Zheng Chen
Labeling text data is time-consuming but essential for automatic text classification. In particular, manually creating multiple labels for each document may become impractical when a very large amount of data is needed for training multi-label text classifiers. To minimize the human labeling effort, we propose a novel multi-label active learning approach that can reduce the required labeled data without sacrificing classification accuracy. Traditional active learning algorithms can only handle single-label problems, that is, each data point is restricted to have one label. Our approach takes into account the multi-label information and selects the unlabeled data that can lead to the largest reduction of the expected model loss. Specifically, the model loss is approximated by the size of the version space, and the reduction rate of the size of the version space is optimized with Support Vector Machines (SVMs). An effective label prediction method is designed to predict possible labels for each unlabeled data point, and the expected loss for multi-label data is approximated by summing up losses on all labels according to the most confident result of label prediction. Experiments on several publicly available real-world data sets demonstrate that our approach can obtain promising classification results with far less labeled data than state-of-the-art methods.
meeting of the association for computational linguistics | 2014
Bishan Yang; Claire Cardie
This paper proposes a novel context-aware method for analyzing sentiment at the level of individual sentences. Most existing machine learning approaches suffer from limitations in the modeling of complex linguistic structures across sentences and often fail to capture nonlocal contextual cues that are important for sentiment interpretation. In contrast, our approach allows structured modeling of sentiment while taking into account both local and global contextual information. Specifically, we encode intuitive lexical and discourse knowledge as expressive constraints and integrate them into the learning of conditional random field models via posterior regularization. The context-aware constraints provide additional power to the CRF model and can guide semi-supervised learning when labeled data is limited. Experiments on standard product review datasets show that our method outperforms the state-of-the-art methods in both the supervised and semi-supervised settings.
meeting of the association for computational linguistics | 2017
Bishan Yang; Tom M. Mitchell
This paper focuses on how to take advantage of external knowledge bases (KBs) to improve recurrent neural networks for machine reading. Traditional methods that exploit knowledge from KBs encode knowledge as discrete indicator features. Not only do these features generalize poorly, but they require task-specific feature engineering to achieve good performance. We propose KBLSTM, a novel neural model that leverages continuous representations of KBs to enhance the learning of recurrent neural networks for machine reading. To effectively integrate background knowledge with information from the currently processed text, our model employs an attention mechanism with a sentinel to adaptively decide whether to attend to background knowledge and which information from KBs is useful. Experimental results show that our model achieves accuracies that surpass the previous state-of-the-art results for both entity extraction and event extraction on the widely used ACE2005 dataset.
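The idea of attending over knowledge base vectors with a sentinel can be sketched as follows. This is a minimal illustration, not the KBLSTM formulation: the dot-product scoring, the fixed sentinel vector, and the function names are all assumptions. The sentinel is treated as one extra candidate, so the weight it receives indicates how much the model falls back on something other than the retrieved knowledge.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def attend_with_sentinel(
    hidden: Sequence[float],
    kb_vectors: Sequence[Sequence[float]],
    sentinel: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Mix KB vectors and a sentinel vector using attention weights.

    Returns the mixed vector and the weights; the last weight belongs to
    the sentinel. Dot-product scoring is an assumption of this sketch.
    """
    candidates = [list(v) for v in kb_vectors] + [list(sentinel)]
    weights = softmax([dot(hidden, c) for c in candidates])
    mixed = [
        sum(w * c[i] for w, c in zip(weights, candidates))
        for i in range(len(hidden))
    ]
    return mixed, weights
```

When the hidden state aligns with the sentinel far better than with any KB vector, nearly all of the weight goes to the sentinel and the KB vectors contribute almost nothing to the mixed output.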
Science in China Series F: Information Sciences | 2012
Tengjiao Wang; Ziyu Lin; Bishan Yang; Jun Gao; Allen Huang; Dongqing Yang; Qi Zhang; Shiwei Tang; Jinzhong Niu
With the coming shift to cloud computing, cloud databases are emerging to provide database services over the Internet. In a cloud-based environment, data are distributed at Internet scale and the system needs to handle a huge number of user queries simultaneously without delay. How data are distributed among the servers has a crucial impact on the query load distribution and the system response time. In this paper, we propose a market-based control method, called MBA, to achieve query load balance via reasonable data distribution. In MBA, database nodes are treated as traders in a market, and certain market rules are used to intelligently decide data allocation and migration. We built a prototype system and conducted extensive experiments. Experimental results show that the MBA method significantly improves system performance in terms of average query response time and fairness.
north american chapter of the association for computational linguistics | 2016
Bishan Yang; Tom M. Mitchell
Events and entities are closely related; entities are often actors or participants in events and events without entities are uncommon. The interpretation of events and entities is highly contextually dependent. Existing work in information extraction typically models events separately from entities, and performs inference at the sentence level, ignoring the rest of the document. In this paper, we propose a novel approach that models the dependencies among variables of events, entities, and their relations, and performs joint inference of these variables across a document. The goal is to enable access to document-level contextual information and facilitate context-aware predictions. We demonstrate that our approach substantially outperforms the state-of-the-art methods for event extraction as well as a strong baseline for entity extraction.
north american chapter of the association for computational linguistics | 2015
Joonsuk Park; Arzoo Katiyar; Bishan Yang
Park and Cardie (2014) proposed a novel task of automatically identifying appropriate types of support for propositions comprising online user comments, as an essential step toward automated analysis of the adequacy of supporting information. While multiclass Support Vector Machines (SVMs) proved to work reasonably well, they do not exploit the sequential nature of the problem: For instance, verifiable experiential propositions tend to appear together, because a personal narrative typically spans multiple propositions. According to our experiments, however, Conditional Random Fields (CRFs) degrade the overall performance, and we discuss potential fixes to this problem. Nonetheless, we observe that the F1 score with respect to the unverifiable proposition class is increased. Also, semi-supervised CRFs with posterior regularization trained on 75% labeled training data can closely match the performance of a supervised CRF trained on the same training data with the remaining 25% labeled as well.
Communications of The ACM | 2018
Tom M. Mitchell; William W. Cohen; Estevam R. Hruschka; Partha Pratim Talukdar; Bishan Yang; Justin Betteridge; Andrew Carlson; Bhavana Dalvi; Matt Gardner; Bryan Kisiel; Jayant Krishnamurthy; Ni Lao; Kathryn Mazaitis; T. Mohamed; Ndapandula Nakashole; Emmanouil Antonios Platanios; Alan Ritter; Mehdi Samadi; Burr Settles; Richard C. Wang; Derry Tanti Wijaya; Abhinav Gupta; Xi Chen; A. Saparov; M. Greaves; J. Welling
Whereas people learn many different types of knowledge from diverse experiences over many years, most current machine learning systems acquire just a single function or data model from just a single data set. We propose a never-ending learning paradigm for machine learning, to better reflect the more ambitious and encompassing type of learning performed by humans. As a case study, we describe the Never-Ending Language Learner (NELL), which achieves some of the desired properties of a never-ending learner, and we discuss lessons learned. NELL has been learning to read the web 24 hours/day since January 2010, and so far has acquired a knowledge base with over 80 million confidence-weighted beliefs (e.g., servedWith(tea, biscuits)). NELL has also learned millions of features and parameters that enable it to read these beliefs from the web. Additionally, it has learned to reason over these beliefs to infer new beliefs, and is able to extend its ontology by synthesizing new relational predicates. NELL can be tracked online at http://rtw.ml.cmu.edu, and followed on Twitter at @CMUNELL.
european conference on information retrieval | 2014
Bishan Yang; Nish Parikh; Gyanit Singh; Neel Sundaresan
Query term deletion is one of the commonly used strategies for query rewriting. In this paper, we study the problem of query term deletion using large-scale e-commerce search logs. Specifically, we focus on queries that do not lead to user clicks and aim to predict a reduced and better query that can lead to clicks by term deletion. Accurate prediction of term deletion can potentially help users recover from poor search results and improve the shopping experience. To achieve this, we use various term-dependent and query-dependent measures as features and build a classifier to predict which term is the most likely to be deleted from a given query. Our approach is data-driven. We investigate the large-scale query history and the document collection, verify the usefulness of previously proposed features, and also propose to incorporate the query category information into the term deletion predictors. We observe that training within-category classifiers can result in much better performance than training a unified classifier. We validate our approach using a large collection of query session logs from a leading e-commerce site and demonstrate that our approach provides promising performance in query term deletion prediction.
asia pacific web conference | 2008
Tengjiao Wang; Bishan Yang; Jun Gao; Dongqing Yang
Modern large distributed applications, such as mobile communications and banking services, require fast responses to enormous and frequent query requests. This kind of application usually operates in a distributed query-intensive data environment, where the system response time depends significantly on how data are distributed. Motivated by this need for efficiency, we develop two novel strategies: a static data distribution strategy, DDH, and a dynamic data reallocation strategy, DRC, to speed up the query response time through load balancing. DDH uses a hash-based heuristic technique to distribute data off-line according to the query history. DRC can reallocate data dynamically at runtime to adapt to the changing query patterns in the system. To validate the performance of these two strategies, experiments are conducted using a simulation environment and real customer data. Experimental results show that they both offer favorable performance with the increasing query load of the system.
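To make the notion of distributing data off-line according to query history concrete, here is a small sketch that places data keys on nodes so that the historical query load is spread evenly. It uses a plain greedy least-loaded placement, which is a stand-in technique and not the hash-based heuristic of DDH; the function names and the toy query log are hypothetical.

```python
import heapq
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def place_by_query_history(
    query_log: Sequence[Hashable], node_count: int
) -> Dict[Hashable, int]:
    """Assign each data key to a node, hottest keys first.

    Greedy rule (an assumption of this sketch): each key goes to the node
    with the smallest accumulated historical load so far.
    """
    frequency = Counter(query_log)
    heap = [(0, node) for node in range(node_count)]
    heapq.heapify(heap)
    placement: Dict[Hashable, int] = {}
    # Sort by descending hit count, then by key for a deterministic order.
    for key, hits in sorted(frequency.items(), key=lambda kv: (-kv[1], kv[0])):
        load, node = heapq.heappop(heap)
        placement[key] = node
        heapq.heappush(heap, (load + hits, node))
    return placement


def node_loads(
    placement: Dict[Hashable, int], query_log: Sequence[Hashable], node_count: int
) -> List[int]:
    """Count how many logged queries each node would have served."""
    loads = [0] * node_count
    for key in query_log:
        loads[placement[key]] += 1
    return loads
```

A dynamic strategy in the spirit of DRC would re-run or incrementally adjust such a placement as the query pattern drifts; that part is not sketched here.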