Xiaoyang Wang | Researchain

Publication

Featured research published by Xiaoyang Wang.

Extending Database Technology | 2014

Diversified Spatial Keyword Search On Road Networks

Chengyuan Zhang; Ying Zhang; Wenjie Zhang; Xuemin Lin; Muhammad Aamir Cheema; Xiaoyang Wang

With the increasing pervasiveness of geo-positioning technologies, an enormous number of spatio-textual objects are available in many applications such as location based services and social networks. Consequently, various types of spatial keyword searches, which explore both the locations and the textual descriptions of the objects, have been intensively studied by research communities and commercial organizations. In many important applications (e.g., location based services), the closeness of two spatial objects is measured by the road network distance. Moreover, result diversification is becoming a common practice to enhance the quality of search results. Motivated by the above facts, in this paper we study the problem of diversified spatial keyword search on road networks, which considers both the relevance and the spatial diversity of the results. An efficient signature-based inverted indexing technique is proposed to facilitate spatial keyword query processing on road networks. Then we develop an efficient diversified spatial keyword search algorithm by taking advantage of spatial keyword pruning and diversity pruning techniques. Comprehensive experiments on real and synthetic data clearly demonstrate the efficiency of our methods.

IEEE Transactions on Knowledge and Data Engineering | 2017

Bring Order into the Samples: A Novel Scalable Method for Influence Maximization

Xiaoyang Wang; Ying Zhang; Wenjie Zhang; Xuemin Lin; Chen Chen

Given a positive integer k, a social network G and a certain propagation model M, influence maximization aims to find a set of k nodes that has the largest influence spread. The state-of-the-art method IMM is based on the reverse influence sampling (RIS) framework. By using the martingale technique, it greatly outperforms the previous methods in efficiency. However, IMM still has limitations in scalability due to the high overhead of deciding a tight sample size. In this paper, instead of spending the effort on deciding a tight sample size, we present a novel bottom-k sketch based RIS framework, namely BKRIS, which brings the order of samples into the RIS framework. By applying the sketch technique, we can derive early termination conditions to significantly accelerate the seed set selection procedure. Moreover, we provide several optimization techniques to reduce the cost of generating and processing samples. Finally, we conduct experiments over 10 real social networks to demonstrate the efficiency and effectiveness of the proposed method. Further details are reported in [1].
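To make the RIS framework mentioned in the abstract more concrete, the following is a minimal sketch of plain reverse influence sampling with a fixed sample size, followed by greedy seed selection. It is neither IMM nor BKRIS: there is no martingale-based sample size bound, no bottom-k sketch, and no early termination. The independent cascade model, the `(source, target, probability)` edge format, and all function names are assumptions made for illustration.

```python
import random
from collections import defaultdict


def random_rr_set(nodes, in_edges, rng):
    """Sample one reverse reachable (RR) set under the independent cascade model."""
    root = rng.choice(nodes)
    reached = {root}
    frontier = [root]
    while frontier:
        v = frontier.pop()
        # Walk edges backwards: u can influence v with probability p.
        for u, p in in_edges.get(v, []):
            if u not in reached and rng.random() < p:
                reached.add(u)
                frontier.append(u)
    return reached


def ris_greedy(nodes, edges, k, num_samples, seed=0):
    """Pick k seeds by greedy maximum coverage over sampled RR sets.

    edges is a list of (source, target, probability) triples (assumed format).
    Returns the seed list and the estimated influence spread.
    """
    rng = random.Random(seed)
    in_edges = defaultdict(list)
    for u, v, p in edges:
        in_edges[v].append((u, p))

    rr_sets = [random_rr_set(nodes, in_edges, rng) for _ in range(num_samples)]
    uncovered = set(range(len(rr_sets)))
    seeds = []
    for _ in range(k):
        # Count how many still-uncovered RR sets each node appears in.
        counts = defaultdict(int)
        for i in uncovered:
            for u in rr_sets[i]:
                counts[u] += 1
        if not counts:
            break
        # Ties are broken in favour of the smallest node id.
        best = max(sorted(counts), key=counts.get)
        seeds.append(best)
        uncovered = {i for i in uncovered if best not in rr_sets[i]}

    covered = len(rr_sets) - len(uncovered)
    spread = len(nodes) * covered / len(rr_sets)
    return seeds, spread
```

The fraction of RR sets covered by a seed set, scaled by the number of nodes, is the usual RIS estimate of its influence spread. Choosing `num_samples` is exactly the cost that the abstract says BKRIS tries to avoid paying up front.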

International ACM SIGIR Conference on Research and Development in Information Retrieval | 2014

Efficiently identify local frequent keyword co-occurrence patterns in geo-tagged Twitter stream

Xiaoyang Wang; Ying Zhang; Wenjie Zhang; Xuemin Lin

With the prevalence of geo-position enabled devices and services, a rapidly growing number of tweets are associated with geo-tags. Consequently, real time search on geo-tagged Twitter streams has attracted great attention. In this paper, we advocate the significance of the co-occurrence of keywords for geo-tagged tweet data analytics, which is overlooked by existing studies. In particular, we formally introduce the problem of identifying local frequent keyword co-occurrence patterns over geo-tagged Twitter streams, namely the LFP query. To accommodate the high volume and the rapid updates of the Twitter stream, we develop an inverted KMV sketch (IK sketch for short) structure to capture the co-occurrence of keywords in limited space. Then efficient algorithms are developed based on the IK sketch to support the LFP query as well as its variant. An extensive empirical study on a real Twitter dataset confirms the effectiveness and efficiency of our approaches.
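As background for the KMV sketch named in the abstract, here is a minimal sketch of a standard k-minimum-values synopsis and the usual estimators for the number of distinct items and for the size of an intersection. It illustrates only the generic KMV idea for one keyword's set of tweet ids; the inverted organisation of the IK sketch and the LFP query processing are not reproduced. The hash choice and all function names are assumptions for illustration.

```python
import hashlib


def unit_hash(item: str) -> float:
    """Map an item to a pseudo-random value in [0, 1) (SHA-1 is an arbitrary choice)."""
    digest = hashlib.sha1(item.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def kmv_sketch(items, k):
    """Keep the k smallest distinct hash values of the items, in sorted order."""
    return sorted({unit_hash(x) for x in items})[:k]


def estimate_distinct(sketch, k):
    """Estimate the number of distinct items summarised by a KMV sketch."""
    if len(sketch) < k:
        # Fewer than k values means nothing was discarded: the count is exact.
        return float(len(sketch))
    return (k - 1) / sketch[k - 1]


def estimate_intersection(a, b, k):
    """Estimate how many items two sketched sets have in common."""
    set_a, set_b = set(a), set(b)
    union = sorted(set_a | set_b)[:k]
    if len(union) < k:
        # Both sketches are complete, so the overlap can be counted exactly.
        return float(len(set_a & set_b))
    shared = sum(1 for h in union if h in set_a and h in set_b)
    # Fraction of shared values among the k smallest, times the union estimate.
    return shared / k * (k - 1) / union[-1]
```

For example, sketching the ids of tweets that contain one keyword and the ids of tweets that contain another lets `estimate_intersection` approximate how often the two keywords co-occur, using at most `k` stored values per keyword.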

World Wide Web | 2017

Categorical top-k spatial influence query

Jianye Yang; Wenjie Zhang; Ying Zhang; Xiaoyang Wang; Xuemin Lin

The influence of a spatial facility object depicts the importance of the object in the whole data space. In this paper, we present a novel definition of object influence in applications where objects are of different categories. We study the problem of Spatial Influence Query, which considers the contribution of an object in forming functional units consisting of a given set of objects with different categories designated by users. We first show that the problem of spatial influence query is NP-hard with respect to the number of object categories in the functional unit. To tackle the computational hardness, we develop an efficient framework with two main steps: finding possible participants and computing the optimal functional unit. Based on this framework, for the first step, novel and efficient pruning techniques are developed based on the nearest neighbor set (NNS) approach. To find the optimal functional unit efficiently, we propose two algorithms: an exact algorithm and an efficient approximate algorithm with a performance guarantee. Comprehensive experiments on both real and synthetic datasets demonstrate the effectiveness and efficiency of our techniques.

Conference on Information and Knowledge Management | 2015

Range Search on Uncertain Trajectories

Liming Zhan; Ying Zhang; Wenjie Zhang; Xiaoyang Wang; Xuemin Lin

Range search on trajectories is fundamental in a wide spectrum of applications such as environment monitoring and location based services. In practice, a large portion of the spatio-temporal data in these applications is generated at a low sampling rate, and uncertainty arises between two subsequent observations of a moving object. To make sense of uncertain trajectory data, it is critical to properly model the uncertainty of the trajectories and develop efficient range search algorithms on the new model. Assuming uncertain trajectories are modeled by the popular Markov chains, in this paper we investigate the problem of range search on uncertain trajectories. In particular, we propose a general framework for range search on uncertain trajectories following the filtering-and-refinement paradigm, where summaries of uncertain trajectories are constructed to facilitate the filtering process. Moreover, statistics based and partition based filtering techniques are developed to enhance the filtering capabilities. Comprehensive experiments demonstrate the effectiveness and efficiency of our new techniques.
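The Markov chain model of uncertainty between two observations can be illustrated with a small forward-backward computation. The sketch below assumes a discrete state space, a known transition matrix `P`, and exact observations at the first and last time steps; it returns the probability of each state at one intermediate time step and the probability of being inside a query region at that step. It does not implement the filtering summaries or the range search algorithms described in the abstract, and all names are hypothetical.

```python
def step_forward(dist, P):
    """Advance a state distribution by one time step (row vector times P)."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def step_backward(beta, P):
    """Probability of the later observation given each current state (P times column vector)."""
    n = len(P)
    return [sum(P[i][j] * beta[j] for j in range(n)) for i in range(n)]


def posterior_at(P, start, end, total_steps, t):
    """State distribution at step t, given state `start` at step 0 and `end` at step total_steps."""
    n = len(P)
    alpha = [1.0 if i == start else 0.0 for i in range(n)]
    for _ in range(t):
        alpha = step_forward(alpha, P)
    beta = [1.0 if i == end else 0.0 for i in range(n)]
    for _ in range(total_steps - t):
        beta = step_backward(beta, P)
    weights = [a * b for a, b in zip(alpha, beta)]
    norm = sum(weights)
    if norm == 0:
        raise ValueError("the two observations are inconsistent with the chain")
    return [w / norm for w in weights]


def prob_in_region(posterior, region):
    """Probability that the object is in any state of the query region."""
    return sum(posterior[s] for s in region)
```

A refinement step in a filtering-and-refinement pipeline would evaluate a probability of this kind only for trajectories that survive the cheaper filters.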

Australasian Database Conference | 2017

DSKQ: A system for efficient processing of diversified spatial-keyword query

Shanqing Jiang; Chengyuan Zhang; Ying Zhang; Wenjie Zhang; Xuemin Lin; Muhammad Aamir Cheema; Xiaoyang Wang

With the rapid development of mobile portable devices and location positioning technologies, a massive amount of geo-textual data is being generated by a huge number of web users on various social platforms, such as Facebook and Twitter. Meanwhile, spatial-textual objects that represent points of interest (POIs, e.g., shops, cinemas, hotels or restaurants) are becoming increasingly pervasive. Consequently, how to retrieve a set of objects that best matches the user's submitted spatial keyword query (SKQ) has been intensively studied by research communities and commercial organisations. Existing works focus only on returning the nearest matching objects, although we observe that many real-life applications now use diversification to enhance the quality of the query results. Thus, existing methods fail to solve the problem of diversified SKQ efficiently. In this demonstration, we introduce DSKQ, a diversified in-memory spatial-keyword query system on road networks, which considers both the textual relevance and the spatial diversity of the results. We present a prototype of DSKQ which provides users with an application-based interface to explore the diversified spatial-keyword query system.

Australasian Database Conference | 2016

Effective Order Preserving Estimation Method

Chen Chen; Wei Wang; Xiaoyang Wang; Shiyu Yang

Order preserving estimation is an estimation method that can retain the original order of the population parameters of interest. It is an important tool in many applications such as data visualization. In this paper, we focus on the population mean as our primary estimation function, and propose an effective query processing strategy that keeps the estimated order correct with probabilistic guarantees. We define the cost function as the number of samples taken over all the groups, and our goal is to make the sample size as small as possible. We compare our methods with the state-of-the-art near-optimal algorithm in the literature, and achieve up to an 80% reduction in the total sample size.

Australasian Database Conference | 2016

EDMS: A System for Efficient Processing Distance-Aware Influence Maximization

Xiaoyang Wang; Chen Chen; Ying Zhang

As a key problem in viral marketing, influence maximization has been widely studied in the literature. It aims to find a set of k users in a social network that maximizes the influence spread under a certain propagation model. With the proliferation of geo-social networks, location-aware promotion is becoming more and more necessary in real applications. However, the importance of the distance between users and the promoted locations is underestimated in existing work. For example, when promoting a local store, the owner may prefer to influence more people who are close to the store rather than people who are far away. In this demonstration, we propose EDMS, a centralized system that efficiently processes the distance-aware influence maximization problem. To meet the online requirements, we combine different pruning strategies and the best-first search algorithm to significantly reduce the search space. We present a prototype, which provides users with a web interface to issue queries and visualize the search results in real time.

Australasian Database Conference | 2016

Efficient Maximum Closeness Centrality Group Identification

Chen Chen; Wei Wang; Xiaoyang Wang

As a key concept in social networks, closeness centrality is widely adopted to measure the importance of a node. Many efficient algorithms have been developed in the literature to find the top-k closeness centrality nodes. In most of the previous work, nodes are treated as unrelated individuals for a top-k ranking. However, many applications require finding a set of nodes that is the most important as a group. In this paper, we extend the concept of closeness centrality to a set of nodes. We aim to find a set of k nodes that has the largest closeness centrality as a whole. We show that the problem is NP-hard, and prove that the objective function is monotonic and submodular. Therefore, the greedy algorithm can return a result with a \(1-1/e\) approximation ratio. In order to handle large graphs, we propose a baseline sampling algorithm (BSA). We further improve the sampling approach by considering the order of samples and reducing the marginal gain update cost, which leads to our order based sampling algorithm (OSA). Finally, extensive experiments on four real world social networks demonstrate the efficiency and effectiveness of the proposed methods.
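The greedy strategy referred to in the abstract can be sketched on a small unweighted, undirected graph. The sketch assumes one common way of extending closeness to a group: the distance from a node to the group is its shortest-path distance to the nearest group member, and the group is better when the total of these distances is smaller. The paper's exact objective may differ, and the sampling algorithms BSA and OSA are not reproduced; this version computes exact all-pairs BFS distances, so it only suits small graphs. Function names and the adjacency-dict input format are assumptions.

```python
from collections import deque


def bfs_distances(adj, source, unreachable):
    """Shortest-path hop counts from source; unreachable nodes get the cap value."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return {v: dist.get(v, unreachable) for v in adj}


def greedy_closeness_group(adj, k):
    """Greedily pick k nodes that minimise the total distance to the nearest group member.

    adj maps each node to an iterable of neighbours (undirected graph assumed).
    Returns the chosen group and the resulting total distance.
    """
    n = len(adj)
    # Distances are capped at n so that the empty group has a finite cost.
    dist = {s: bfs_distances(adj, s, n) for s in adj}
    nearest = {v: n for v in adj}
    group = []
    for _ in range(min(k, n)):
        def total_with(s):
            return sum(min(nearest[v], dist[s][v]) for v in adj)

        # Ties are broken in favour of the smallest node id.
        candidates = [s for s in sorted(adj) if s not in group]
        best = min(candidates, key=total_with)
        group.append(best)
        nearest = {v: min(nearest[v], dist[best][v]) for v in adj}
    return group, sum(nearest.values())
```

Under this assumed objective, the reduction in total distance is a facility-location style function, which is the kind of monotone submodular objective for which greedy selection gives the \(1-1/e\) guarantee mentioned above.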

International Conference on Management of Data | 2015

Optimal Spatial Dominance: An Effective Search of Nearest Neighbor Candidates

Xiaoyang Wang; Ying Zhang; Wenjie Zhang; Xuemin Lin; Muhammad Aamir Cheema

In many domains such as computational geometry and database management, an object may be described by multiple instances (points). Then the distance (or similarity) between two objects is captured by the pair-wise distances among their instances. In the past, numerous nearest neighbor (NN) functions have been proposed to define the distance between objects with multiple instances and to identify the NN object. Nevertheless, considering that a user may not have a specific NN function in mind, it is desirable to provide her with a set of NN candidates. Ideally, the set of NN candidates must include every object that is the NN for at least one of the NN functions and must exclude every non-promising object. However, no one has studied the problem of computing NN candidates from this perspective. Although some existing works aim at returning a set of candidate objects, they do not focus on the NN functions while computing the candidate objects. As a result, they either fail to include an NN object w.r.t. some NN functions or include a large number of unnecessary objects that have no potential to be the NN regardless of the NN functions. Motivated by this, we classify the existing NN functions for objects with multiple instances into three families by characterizing their key features. Then, we advocate three spatial dominance operators to compute NN candidates, where each operator is optimal w.r.t. a different coverage of NN functions. Efficient algorithms are proposed for the dominance check and the corresponding NN candidates computation. An extensive empirical study on real and synthetic datasets shows that our proposed operators can significantly reduce the number of NN candidates. The comprehensive performance evaluation demonstrates the efficiency of our computation techniques.
