Yantao Jia
Chinese Academy of Sciences
Publication
Featured research published by Yantao Jia.
Advances in Social Networks Analysis and Mining | 2014
Yantao Jia; Yuanzhuo Wang; Xueqi Cheng; Xiaolong Jin; Jiafeng Guo
With the arrival of the big data era, there is an urgent need for a knowledge computational engine that can discover implicit and valuable knowledge in huge, rapidly changing, and complex network data. In this paper, we first survey the mainstream knowledge computational engines from four aspects and point out their deficiencies. To address these shortcomings, we propose the open knowledge network (OpenKN), a self-adaptive and evolutionable knowledge computational engine for network big data. To the best of our knowledge, this is the first work to design an end-to-end, holistic knowledge processing pipeline for network big data. Moreover, to capture the evolutionable computing capability of OpenKN, we present the evolutionable knowledge network for knowledge representation. A case study demonstrates the effectiveness of the evolutionable computing of OpenKN.
Advances in Social Networks Analysis and Mining | 2014
Hailun Lin; Yantao Jia; Yuanzhuo Wang; Xiaolong Jin; Xiaojing Li; Xueqi Cheng
Populating a knowledge base with new entity mentions extracted from unstructured text can enhance its coverage and freshness. The task naturally consists of two subtasks, namely fine-grained entity classification and entity linking. Existing studies often focus on only one of these subtasks, and they usually populate entity mentions from the same text under the implicit assumption that the mentions are independent. However, such entity mentions are often semantically related to each other, and it would be better to populate them into the knowledge base collectively. To address these problems, we propose a unified collective inference approach based on an interdependence graph, called CIIGA, for populating a knowledge base with collective entities. It jointly determines the proper locations of all entity mentions in the same text by exploiting their interdependence relationships. Experimental results show that this approach achieves a significant accuracy improvement over the baseline approach, APOLLO, on the task of knowledge base population with multiple entities.
ACM Transactions on Intelligent Systems and Technology | 2016
Yantao Jia; Yuanzhuo Wang; Xiaolong Jin; Xueqi Cheng
In social networks, predicting a user’s location mainly depends on those of his/her friends, where the key lies in how to select his/her most influential friends. In this article, we analyze the theoretically maximal accuracy of location prediction based on friends’ locations and compare it with the practical accuracy obtained by the state-of-the-art location prediction methods. Upon observing a big gap between the theoretical and practical accuracy, we propose a new strategy for selecting influential friends in order to improve the practical location prediction accuracy. Specifically, several features are defined to measure the influence of the friends on a user’s location, based on which we put forth a sequential random-walk-with-restart procedure to rank the friends of the user in terms of their influence. By dynamically selecting the top N most influential friends of the user per time slice, we develop a temporal-spatial Bayesian model to characterize the dynamics of friends’ influence for location prediction. Finally, extensive experimental results on datasets of real social networks demonstrate that the proposed influential friend selection method and temporal-spatial Bayesian model can significantly improve the accuracy of location prediction.
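The friend-ranking step described above can be illustrated with a plain random walk with restart. The following Python sketch is a simplified illustration, not the authors' sequential procedure: the graph layout (`adj`), the edge weights, the restart probability, and the function names are all assumptions made for this example.

```python
def random_walk_with_restart(adj, seed, restart=0.15, iters=200, tol=1e-12):
    """Return visiting probabilities of a walk that restarts at `seed`.

    adj: dict mapping node -> dict of neighbor -> non-negative weight.
    Every neighbor must also appear as a key of `adj` (assumption).
    """
    nodes = list(adj)
    scores = {n: 0.0 for n in nodes}
    scores[seed] = 1.0
    for _ in range(iters):
        nxt = {n: 0.0 for n in nodes}
        for u in nodes:
            total = sum(adj[u].values())
            if total == 0:
                # Dangling node: send its mass back to the seed.
                nxt[seed] += (1 - restart) * scores[u]
                continue
            for v, w in adj[u].items():
                nxt[v] += (1 - restart) * scores[u] * w / total
        # With probability `restart` the walker jumps back to the seed.
        nxt[seed] += restart
        delta = sum(abs(nxt[n] - scores[n]) for n in nodes)
        scores = nxt
        if delta < tol:
            break
    return scores


def top_influential_friends(adj, user, n):
    """Rank the direct friends of `user` by their restart-walk score."""
    scores = random_walk_with_restart(adj, user)
    return sorted(adj[user], key=lambda f: scores[f], reverse=True)[:n]


# Hypothetical toy network: weights stand in for interaction strength.
toy = {
    "u": {"a": 3.0, "b": 1.0, "c": 1.0},
    "a": {"u": 3.0, "b": 1.0},
    "b": {"u": 1.0, "a": 1.0},
    "c": {"u": 1.0},
}
top_two = top_influential_friends(toy, "u", 2)
```

In this toy network, friend `a` interacts most strongly with `u` and therefore ranks first. In a per-time-slice setting, the same ranking would be recomputed on each slice's graph.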
Journal of Computer Science and Technology | 2015
Yantao Jia; Yuanzhuo Wang; Xueqi Cheng
Unsupervised link prediction in microblogs, which aims to find an appropriate similarity measure between users in the network, has been studied extensively in recent years. However, the measures used in existing work lack a simple way to incorporate the structure of the network and the interactions between users. This leads to a gap between the predicted results and the ground truth. For example, the F1-measure achieved by the best method is around 0.2. In this work, we first identify this gap and prove its existence. To narrow the gap, we define the retweeting similarity to measure the interactions between users on Twitter, and propose a structural-interaction-based matrix factorization model for following-link prediction. Experiments on real-world Twitter data show that our model outperforms state-of-the-art methods.
WWW '18 Companion Proceedings of The Web Conference 2018 | 2018
Yunqi Qiu; Manling Li; Yuanzhuo Wang; Yantao Jia; Xiaolong Jin
Topic entity detection aims to identify the main entity that a question asks about, and it plays a significant role in question answering. Traditional methods ignore information about entities, especially entity types and their hierarchical structures, which restricts their performance. To take full advantage of the knowledge base (KB) and detect topic entities correctly, we propose a deep neural model that leverages the type hierarchy and relations of entities in the KB. Experimental results demonstrate the effectiveness of the proposed method.
arXiv: Artificial Intelligence | 2017
Denghui Zhang; Manling Li; Yantao Jia; Yuanzhuo Wang; Xueqi Cheng
Knowledge graph embedding aims to embed the entities and relations of knowledge graphs into low-dimensional vector spaces. Translating embedding methods regard relations as translations from head entities to tail entities, and they achieve state-of-the-art results among knowledge graph embedding methods. However, a major limitation of these methods is the time-consuming training process, which may take several days or even weeks for large knowledge graphs and makes practical application very difficult. In this paper, we propose an efficient parallel framework for translating embedding methods, called ParTrans-X, which enables the methods to be parallelized without locks by exploiting the distinctive structures of knowledge graphs. Experiments on two datasets with three typical translating embedding methods, i.e., TransE [3], TransH [19], and a more efficient variant, TransE-AdaGrad [11], validate that ParTrans-X can speed up the training process by more than an order of magnitude.
IEEE Transactions on Computational Social Systems | 2017
Yantao Jia; Yuanzhuo Wang; Xiaolong Jin; Zeya Zhao; Xueqi Cheng
Link inference, i.e., inferring links between vertices in a heterogeneous information network with heterogeneous vertices and edges, has been extensively studied in recent years. Many machine learning-based methods have been proposed for link inference, and they can be classified into two categories, namely supervised and unsupervised. Supervised methods perform well but rely heavily on feature selection and training data. Although unsupervised methods are inferior to supervised ones, they work in a relatively simple way and do not need to consider the class distribution of the training data. In this paper, we investigate the link inference problem in heterogeneous information networks by proposing a knapsack-constrained inference method. Specifically, we integrate dynamic information into the heterogeneous information network and formalize the link inference problem as a knapsack-like problem. We then solve it by means of an optimization approach analogous to the 0–1 knapsack problem and analyze the time complexity of the proposed approach. Finally, experimental results show that the proposed unsupervised method can achieve performance comparable to that of supervised methods in some cases.
ACM Transactions on the Web | 2017
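For readers unfamiliar with the 0–1 knapsack problem that the abstract above builds on, the following Python sketch shows the standard dynamic program. Treating each candidate link as an item with a score (value), an integer cost (weight), and a fixed budget (capacity) is an assumption made here for illustration; the abstract does not spell out the paper's actual formulation.

```python
def knapsack_01(values, weights, capacity):
    """Standard 0-1 knapsack DP.

    Returns (best total value, indices of the chosen items).
    Weights and capacity must be non-negative integers.
    """
    n = len(values)
    # best[i][c] = best value using the first i items within capacity c.
    best = [[0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        v, w = values[i - 1], weights[i - 1]
        for c in range(capacity + 1):
            best[i][c] = best[i - 1][c]          # skip item i-1
            if w <= c and best[i - 1][c - w] + v > best[i][c]:
                best[i][c] = best[i - 1][c - w] + v  # take item i-1
    # Walk back through the table to recover which items were taken.
    chosen, c = [], capacity
    for i in range(n, 0, -1):
        if best[i][c] != best[i - 1][c]:
            chosen.append(i - 1)
            c -= weights[i - 1]
    chosen.reverse()
    return best[n][capacity], chosen


# Hypothetical candidate links: scores as values, costs as weights.
link_scores = [60, 100, 120]
link_costs = [1, 2, 3]
total, picked = knapsack_01(link_scores, link_costs, capacity=5)
```

The table has (n + 1) × (capacity + 1) cells, so this dynamic program runs in O(n · capacity) time.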
Yantao Jia; Yuanzhuo Wang; Xiaolong Jin; Hailun Lin; Xueqi Cheng
A knowledge graph is a graph with entities of different types as nodes and various relations among them as edges. The construction of knowledge graphs over the past decades has facilitated many applications, such as link prediction, web search analysis, and question answering. Knowledge graph embedding aims to represent entities and relations in a large-scale knowledge graph as elements in a continuous vector space. Existing methods, for example, TransE, TransH, and TransR, learn the embedding representation by defining a global margin-based loss function over the data. However, the loss function is determined during experiments, with its parameters selected from a closed set of candidates. Moreover, embeddings over two knowledge graphs with different entities and relations share the same set of candidates, ignoring the locality of both graphs. This limits the performance of embedding-related applications. In this article, a locally adaptive translation method for knowledge graph embedding, called TransA, is proposed to find the loss function by adaptively determining its margin over different knowledge graphs. The convergence of TransA is then verified from the aspect of its uniform stability. To keep the embeddings up to date when new vertices and edges are added to the knowledge graph, an incremental algorithm for TransA, called iTransA, is proposed, which adaptively adjusts the optimal margin over time. Experiments on four benchmark data sets demonstrate the superiority of the proposed method over state-of-the-art ones.
Asia-Pacific Web Conference | 2015
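To make the role of the margin concrete, the following Python sketch computes a TransE-style translation distance and the margin-based hinge loss for one positive and one corrupted triple. The toy vectors and function names are assumptions for illustration, and the sketch uses a fixed margin supplied by the caller; it does not implement how TransA determines its margin.

```python
import math


def translation_distance(h, r, t):
    """Euclidean distance ||h + r - t|| between translated head and tail."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def margin_loss(positive, negative, margin):
    """Hinge loss max(0, margin + d(positive) - d(negative)) for one pair.

    `positive` and `negative` are (head, relation, tail) vector triples.
    """
    d_pos = translation_distance(*positive)
    d_neg = translation_distance(*negative)
    return max(0.0, margin + d_pos - d_neg)


# Hypothetical 2-dimensional embeddings.
head, relation = [0.0, 0.0], [1.0, 0.0]
true_tail, corrupted_tail = [1.0, 0.0], [1.0, 3.0]

positive = (head, relation, true_tail)        # distance 0.0
negative = (head, relation, corrupted_tail)   # distance 3.0

loss_small_margin = margin_loss(positive, negative, margin=1.0)
loss_large_margin = margin_loss(positive, negative, margin=4.0)
```

With a margin of 1.0 the corrupted triple is already far enough away and the loss is zero, while a margin of 4.0 still penalizes the same pair. This is why the choice of margin affects what the embedding learns.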
Hailun Lin; Yuanzhuo Wang; Yantao Jia; Jinhua Xiong; Peng Zhang; Xueqi Cheng
Taxonomy matching is an important operation in knowledge base merging. Several matchers for automating taxonomy matching have been proposed and evaluated in the knowledge base community. Studies reveal that no single taxonomy matcher is suitable for every domain-specific taxonomy mapping, so an ensemble of taxonomy matchers is essential. In this paper, we propose taxonomy metamatching, a distributed computing framework for assembling taxonomy matchers and generating an optimal taxonomy mapping. We also introduce TRA, a Threshold Rank Aggregation algorithm for this problem. Experimental results show that TRA outperforms state-of-the-art approaches regardless of the domains and scales of the taxonomies, which demonstrates that TRA adapts well to taxonomy matching.
Knowledge-Based Systems | 2018
Yantao Jia; Yuanzhuo Wang; Xiaolong Jin; Xueqi Cheng
Knowledge graph embedding aims to represent the entities, relations, and multi-step relation paths of a knowledge graph as vectors in low-dimensional vector spaces, and it supports many applications, such as entity prediction and relation prediction. Existing embedding methods learn the representations of entities, relations, and multi-step relation paths by minimizing a general margin-based loss function shared by all relation paths. This setting fails to consider the differences among relation paths. In this paper, we propose an embedding method for knowledge graphs, called PaSKoGE, that minimizes a path-specific margin-based loss function. For each path, it adaptively determines the margin-based loss function by encoding the correlation between relations and multi-step relation paths for any given pair of entities. PaSKoGE outperforms state-of-the-art methods.