Tengjiao Wang | Researchain

Publication

Featured research published by Tengjiao Wang.

knowledge discovery and data mining | 2009

Effective multi-label active learning for text classification

Bishan Yang; Jian-Tao Sun; Tengjiao Wang; Zheng Chen

Labeling text data is time-consuming but essential for automatic text classification. In particular, manually creating multiple labels for each document may become impractical when a very large amount of data is needed for training multi-label text classifiers. To minimize the human labeling effort, we propose a novel multi-label active learning approach which can reduce the required labeled data without sacrificing classification accuracy. Traditional active learning algorithms can only handle single-label problems, that is, each data point is restricted to have one label. Our approach takes the multi-label information into account and selects the unlabeled data which can lead to the largest reduction of the expected model loss. Specifically, the model loss is approximated by the size of the version space, and the reduction rate of the size of the version space is optimized with Support Vector Machines (SVM). An effective label prediction method is designed to predict possible labels for each unlabeled data point, and the expected loss for multi-label data is approximated by summing up losses on all labels according to the most confident result of label prediction. Experiments on several real-world data sets (all publicly available) demonstrate that our approach can obtain promising classification results with far fewer labeled data than state-of-the-art methods.

very large data bases | 2011

Relational approach for shortest path discovery over large graphs

Jun Gao; Ruoming Jin; Jiashuai Zhou; Jeffrey Xu Yu; Xiao Jiang; Tengjiao Wang

With the rapid growth of large graphs, we cannot assume that graphs can still be fully loaded into memory, so disk-based graph operations are inevitable. In this paper, we take shortest path discovery as an example to investigate the technical issues that arise when leveraging the existing infrastructure of a relational database (RDB) for graph data management. Based on the observation that a variety of graph search queries can be implemented by iterative operations (selecting frontier nodes from visited nodes, expanding from the selected frontier nodes, and merging the expanded nodes into the visited ones), we introduce a relational FEM framework with three corresponding operators to implement graph search tasks in the RDB context. We show that new features introduced by recent SQL standards, such as window functions and the merge statement, can not only simplify the expression but also improve the performance of the FEM framework. In addition, we propose two optimization strategies specific to shortest path discovery inside the FEM framework. First, we use a bi-directional set Dijkstra's algorithm for path finding. The bi-directional strategy can reduce the search space, and the set Dijkstra's algorithm finds the shortest path in a set-at-a-time fashion. Second, we introduce an index named SegTable to preserve the local shortest segments, and exploit SegTable to further improve performance. Extensive experimental results illustrate that our relational approach with the optimization strategies achieves high scalability and performance.
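The frontier, expand, and merge steps described in this abstract can be illustrated with a minimal in-memory sketch. This is not the paper's implementation: the paper expresses the operators in SQL inside a relational database, whereas the function name `fem_shortest_distance`, the `visited` dictionary standing in for a relational table, and the single-direction search below are all assumptions made for illustration. Edges are treated as directed with non-negative weights.

```python
import math


def fem_shortest_distance(edges, source, target):
    """Set-at-a-time Dijkstra written as Frontier / Expand / Merge steps.

    edges: list of directed (u, v, weight) tuples with non-negative weights.
    Returns the shortest distance, or math.inf if target is unreachable.
    """
    # Hypothetical stand-in for a "visited" table: node -> [distance, finalized]
    visited = {source: [0.0, False]}

    while True:
        # F step: among non-finalized nodes, pick ALL nodes at the minimum distance.
        open_nodes = {n: d for n, (d, done) in visited.items() if not done}
        if not open_nodes:
            return math.inf
        min_d = min(open_nodes.values())

        # The target's distance is final once no open node is closer than it.
        if target in visited and visited[target][0] <= min_d:
            return visited[target][0]

        frontier = {n for n, d in open_nodes.items() if d == min_d}
        for n in frontier:
            visited[n][1] = True

        # E step: "join" the frontier with the edge list, keeping the best
        # candidate distance per expanded node.
        expanded = {}
        for u, v, w in edges:
            if u in frontier:
                cand = min_d + w
                if cand < expanded.get(v, math.inf):
                    expanded[v] = cand

        # M step: merge expanded nodes into visited (insert new, update improved).
        for v, d in expanded.items():
            if v not in visited:
                visited[v] = [d, False]
            elif d < visited[v][0]:
                visited[v][0] = d


edges = [("a", "b", 1), ("b", "c", 2), ("a", "c", 5), ("c", "d", 1)]
print(fem_shortest_distance(edges, "a", "d"))
```

Selecting every node at the minimum distance in one pass, rather than one node at a time, is what makes the loop set-at-a-time; in a relational setting each of the three steps would map to a query over tables rather than a Python loop.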

international conference on management of data | 2011

Neighborhood-privacy protected shortest distance computing in cloud

Jun Gao; Jeffrey Xu Yu; Ruoming Jin; Jiashuai Zhou; Tengjiao Wang; Dongqing Yang

With the advent of cloud computing, it becomes desirable to utilize cloud computing to efficiently process complex operations on large graphs without compromising their sensitive information. This paper studies shortest distance computing in the cloud, which aims at the following goals: i) preventing outsourced graphs from neighborhood attack, ii) preserving shortest distances in outsourced graphs, iii) minimizing overhead on the client side. The basic idea of this paper is to transform an original graph G into a link graph Gl kept locally and a set of outsourced graphs Go. Each outsourced graph should meet the requirement of a new security model called 1-neighborhood-d-radius. In addition, the shortest distance query can be answered using Gl and Go. Our objective is to minimize the space cost on the client side when both security and utility requirements are satisfied. We devise a greedy method to produce Gl and Go, which can exactly answer the shortest distance queries. We also develop an efficient transformation method to support approximate shortest distance answering under a given additive error bound. The final experimental results illustrate the effectiveness and efficiency of our method.

international conference on data mining | 2006

A Mixed Process Neural Network and its Application to Churn Prediction in Mobile Communications

Guojie Song; Dongqing Yang; Ling Wu; Tengjiao Wang; Shiwei Tang

Churn prediction is an increasingly pressing issue in today's ever-competitive commercial environments, especially in the mobile communication arena. In this paper, a mixed process neural network (MPNN) based on Fourier orthogonal base functions is proposed to support churn management, which can deal with both static values and time-varying continuous values simultaneously. To further improve its performance, an optimized network, c-MPNN, is presented, which adopts Fourier expansion based preprocessing and hidden layer combination techniques to optimize MPNN's structure. Most important of all, our method has been used in real applications at China Mobile. Experiments based on real datasets also show that our proposed churn prediction method has good maneuverability and performance.

international conference on data mining | 2008

SeqStream: Mining Closed Sequential Patterns over Stream Sliding Windows

Lei Chang; Tengjiao Wang; Dongqing Yang; Hua Luan

Previous studies have shown that mining closed patterns provides more benefits than mining the complete set of frequent patterns, since closed pattern mining leads to more compact results and more efficient algorithms. It is quite useful in a data stream environment where memory and computation power are major concerns. This paper studies the problem of mining closed sequential patterns over data stream sliding windows. A synopsis structure IST (Inverse Closed Sequence Tree) is designed to keep inverse closed sequential patterns in the current window. An efficient algorithm, SeqStream, is developed to mine closed sequential patterns in stream windows incrementally, and various novel strategies are adopted in SeqStream to prune the search space aggressively. Extensive experiments on both real and synthetic data sets show that SeqStream outperforms PrefixSpan, CloSpan and BIDE by about one to two orders of magnitude.

Science in China Series F: Information Sciences | 2012

MBA: A market-based approach to data allocation and dynamic migration for cloud database

Tengjiao Wang; Ziyu Lin; Bishan Yang; Jun Gao; Allen Huang; Dongqing Yang; Qi Zhang; Shiwei Tang; Jinzhong Niu

With the coming shift to cloud computing, cloud database is emerging to provide database service over the Internet. In the cloud-based environment, data are distributed at Internet scale and the system needs to handle a huge number of user queries simultaneously without delay. How data are distributed among the servers has a crucial impact on the query load distribution and the system response time. In this paper, we propose a market-based control method, called MBA, to achieve query load balance via reasonable data distribution. In MBA, database nodes are treated as traders in a market, and certain market rules are used to intelligently decide data allocation and migration. We built a prototype system and conducted extensive experiments. Experimental results show that the MBA method significantly improves system performance in terms of average query response time and fairness.

Osteoporosis International | 2016

Comparative efficacy of bisphosphonates in short-term fracture prevention for primary osteoporosis: a systematic review with network meta-analyses

Junwen Zhou; Xiang Ma; Tengjiao Wang; Suodi Zhai

Summary: Our network meta-analyses compared the efficacy of different bisphosphonates in preventing fractures in primary osteoporosis. By including 36 studies, we found that zoledronic acid seemed the most effective in preventing vertebral fracture, nonvertebral fracture, and any fracture, and alendronate or zoledronic acid seemed the most effective in preventing hip fracture.
Introduction: This study was conducted in order to analyze the available evidence on the efficacy of bisphosphonates for preventing fractures.
Methods: We considered randomized trials comparing any bisphosphonate with another bisphosphonate or placebo. We searched Cochrane Library, Embase, and PubMed and manually searched the reference lists of relevant articles. Pairwise and network meta-analyses were performed. The primary outcome is vertebral fracture. Secondary outcomes include nonvertebral fracture, hip fracture, wrist fracture, and any fracture.
Results: Thirty-six studies were included. Significant differences were found between bisphosphonates for vertebral fracture and nonvertebral fracture (P < 0.0001 and P = 0.04, respectively). Compared with placebo, alendronate, clodronate, ibandronate, minodronate, pamidronate, risedronate, and zoledronic acid significantly prevented vertebral fracture. Zoledronic acid significantly reduced the risk of vertebral fracture, compared with alendronate, clodronate, etidronate, ibandronate, risedronate, and tiludronate (0.65 (0.46, 0.91), 0.53 (0.33, 0.86), 0.45 (0.27, 0.74), 0.52 (0.36, 0.75), 0.59 (0.42, 0.83), and 0.31 (0.21, 0.48), respectively). Compared with etidronate, clodronate and zoledronic acid significantly prevented nonvertebral fracture. Compared with alendronate, zoledronic acid significantly prevented any fracture. The possibility rankings showed that zoledronic acid ranked first in preventing vertebral fracture, hip fracture, and any fracture, and pamidronate ranked first in preventing nonvertebral fracture and wrist fracture. In the sensitivity analyses, zoledronic acid ranked first in preventing nonvertebral fracture, and alendronate ranked first in preventing hip fracture and wrist fracture.
Conclusion: Zoledronic acid seemed the most effective in preventing vertebral fracture, nonvertebral fracture, and any fracture, and alendronate or zoledronic acid seemed the most effective in preventing hip fracture. Uncertainty still remains and future studies are needed to accurately evaluate the comparative efficacy of bisphosphonates.

conference on information and knowledge management | 2010

Fast top-k simple shortest paths discovery in graphs

Jun Gao; Huida Qiu; Xiao Jiang; Tengjiao Wang; Dongqing Yang

With the wide application of large-scale graph data such as social networks, the problem of finding the top-k shortest paths attracts increasing attention. This paper focuses on the discovery of the top-k simple shortest paths (paths without loops). The well-known algorithm for this problem is due to Yen, and its worst-case bound O(kn(m + n log n)), which comes from O(n) single-source shortest path discoveries for each of the k shortest paths, has remained unbeaten for 30 years, where n is the number of nodes and m is the number of edges. In this paper, we observe that there are shared sub-paths among the O(kn) single-source shortest paths. The basic idea behind our method is to pre-compute the shortest paths to the target node, and utilize them to reduce the discovery cost at running time. Specifically, we transform the original graph by encoding the pre-computed paths, and prove that the shortest path discovered over the transformed graph is equivalent to that in the original graph. Most importantly, the path discovery over the transformed graph can be terminated much earlier than before. In addition, two optimization strategies are presented. One is to reduce the total number of iterations for shortest path discovery, and the other is to prune the search space in each iteration with an adaptively determined threshold. Although the worst-case complexity cannot be lowered, our method is proven to be much more efficient in the general case. Extensive experimental results (on both real and synthetic graphs) also show that our method offers a significant performance improvement over the existing ones.

database and expert systems applications | 2007

MQTree based query rewriting over multiple XML views

Jun Gao; Tengjiao Wang; Dongqing Yang

Using XML views to answer an XML query is an important query optimization strategy, especially in a distributed environment. Although many methods have been proposed to handle single XML view rewriting, they lead to redundant computation cost due to the shared paths among different XML views. This paper handles query rewriting over multiple views by organizing the multiple XML views into a tree called MQTree, in which the shared sub-paths among the multiple views are merged in a top-down fashion. In addition, this paper designs an MQTree-based query rewriting method. The candidate query rewriting plans are generated over MQTree directly. In order to reduce the validation cost of the candidate query rewriting plans, a preliminary validation is first made at the granularity of the path query {//,/,*} over the MQTree, which further prunes the candidate views and provides intermediate results for validating the plans at the granularity of the whole tree. The final experiments show the efficiency and effectiveness of our method.

north american chapter of the association for computational linguistics | 2016

pkudblab at SemEval-2016 Task 6: A Specific Convolutional Neural Network System for Effective Stance Detection

Wan Wei; Xiao Zhang; Xuqin Liu; Wei Chen; Tengjiao Wang

In this paper, we develop a convolutional neural network for stance detection in tweets. According to the official results, our system ranks first on subtask B (among 9 teams) and second on subtask A (among 19 teams) on the Twitter test set of SemEval-2016 Task 6. The main contributions of our work are as follows. We design a "vote scheme" for prediction instead of predicting when the accuracy on the validation set reaches its maximum. In addition, we make some improvements on the specific subtasks. For subtask A, we separate the datasets into five sub-datasets according to their targets, and train and test five separate models. For subtask B, we establish a two-class training dataset from the official domain corpus, and then modify the softmax layer to perform three-class classification. Our system can be easily re-implemented and optimized for other related tasks.
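The abstract does not spell out how the vote scheme works, so the following is only a sketch of one plausible reading: a per-tweet majority vote over the labels predicted by several training runs or checkpoints. The function name `majority_vote`, the input layout, and the label strings are assumptions for illustration, not details taken from the paper.

```python
from collections import Counter


def majority_vote(predictions_per_run):
    """Combine label predictions from several runs by per-example majority vote.

    predictions_per_run: list of equal-length label lists, one list per run
    (hypothetical layout; each position corresponds to the same tweet).
    On a tie, the label seen first among the tied ones is returned.
    """
    voted = []
    # zip(*...) groups the predictions for the same example across runs.
    for labels in zip(*predictions_per_run):
        most_common_label, _count = Counter(labels).most_common(1)[0]
        voted.append(most_common_label)
    return voted


runs = [
    ["FAVOR", "AGAINST", "NONE"],
    ["FAVOR", "NONE", "NONE"],
    ["AGAINST", "AGAINST", "NONE"],
]
print(majority_vote(runs))
```

Voting over several runs avoids committing to the single checkpoint with the highest validation accuracy, which may be a noisy choice when the validation set is small.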
