Jianzhong Qi | Researchain

Publication

Featured researches published by Jianzhong Qi.

International Conference on Data Engineering | 2012

The Min-dist Location Selection Query

Jianzhong Qi; Rui Zhang; Lars Kulik; Dan Lin; Yuan Xue

We propose and study a new type of location optimization problem: given a set of clients and a set of existing facilities, we select a location from a given set of potential locations for establishing a new facility so that the average distance between a client and her nearest facility is minimized. We call this problem the min-dist location selection problem, which has a wide range of applications in urban development simulation, massively multiplayer online games, and decision support systems. We explore two common approaches to location optimization problems and propose methods based on those approaches for solving this new problem. However, those methods either need to maintain an extra index or fall short in efficiency. To address their drawbacks, we propose a novel method (named MND), which has very close performance to the fastest method but does not need an extra index. We provide a detailed comparative cost analysis on the various algorithms. We also perform extensive experiments to evaluate their empirical performance and validate the efficiency of the MND method.
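The problem definition above can be made concrete with an exhaustive baseline. The sketch below is not the MND method from the abstract; it is a brute-force evaluation under the assumptions that locations are 2D points, distances are Euclidean, and the function name `min_dist_location` is purely illustrative.

```python
from math import dist

def min_dist_location(clients, facilities, candidates):
    """Return (best_candidate, average_distance) by exhaustive evaluation."""
    # Distance from each client to its nearest existing facility.
    nearest = [min(dist(c, f) for f in facilities) for c in clients]
    best, best_avg = None, float("inf")
    for p in candidates:
        # A new facility at p only helps the clients that are closer to p
        # than to their current nearest facility.
        total = sum(min(d, dist(c, p)) for c, d in zip(clients, nearest))
        avg = total / len(clients)
        if avg < best_avg:
            best, best_avg = p, avg
    return best, best_avg

clients = [(0, 0), (10, 0), (10, 10)]
facilities = [(0, 1)]
candidates = [(5, 5), (10, 5)]
best, avg = min_dist_location(clients, facilities, candidates)
print(best, round(avg, 4))  # (10, 5) 3.6667
```

This baseline costs one distance computation per client for every candidate, which is the kind of cost the indexed and pruning-based methods in the paper aim to avoid.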

Conference on Information and Knowledge Management | 2011

Top-k most influential locations selection

Jin Huang; Zeyi Wen; Jianzhong Qi; Rui Zhang; Jian Chen; Zhen He

We propose and study a new type of facility location selection query, the top-k most influential location selection query. Given a set M of customers and a set F of existing facilities, this query finds k locations from a set C of candidate locations with the largest influence values, where the influence of a candidate location c (c in C) is defined as the number of customers in M who are the reverse nearest neighbors of c. We first present a naive algorithm to process the query. However, the algorithm is computationally expensive and not scalable to large datasets. This motivates us to explore more efficient solutions. We propose two branch and bound algorithms, the Estimation Expanding Pruning (EEP) algorithm and the Bounding Influence Pruning (BIP) algorithm. These algorithms exploit various geometric properties to prune the search space, and thus achieve much better performance than that of the naive algorithm. Specifically, the EEP algorithm estimates the distances to the nearest existing facilities for the customers and the numbers of influenced customers for the candidate locations, and then gradually refines the estimation until the answer set is found, during which distance metric based pruning techniques are used to improve the refinement efficiency. BIP only estimates the numbers of influenced customers for the candidate locations. But it uses the existing facilities to limit the space for searching the influenced customers and achieve a better estimation, which results in an even more efficient algorithm. Extensive experiments conducted on both real and synthetic datasets validate the efficiency of the algorithms.
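A minimal sketch of the naive approach follows, to make the influence definition concrete. It assumes 2D points, Euclidean distance, and that a customer counts towards a candidate only when the candidate is strictly closer than the customer's nearest existing facility (the tie rule is an assumption). The name `top_k_influential` is illustrative; the EEP and BIP algorithms are not reproduced here.

```python
from math import dist
import heapq

def top_k_influential(customers, facilities, candidates, k):
    """Return the k (influence, candidate) pairs with the largest influence."""
    # Nearest existing facility distance for every customer.
    nearest = [min(dist(m, f) for f in facilities) for m in customers]

    def influence(c):
        # Customers that would switch to c are its reverse nearest neighbors.
        return sum(1 for m, d in zip(customers, nearest) if dist(m, c) < d)

    scored = [(influence(c), c) for c in candidates]
    return heapq.nlargest(k, scored, key=lambda s: s[0])

customers = [(0, 0), (1, 0), (5, 5), (6, 5), (9, 9)]
facilities = [(3, 3)]
candidates = [(0, 1), (5, 6), (9, 8)]
print(top_k_influential(customers, facilities, candidates, 2))
# [(3, (5, 6)), (2, (0, 1))]
```

Every candidate is compared against every customer here, which is why the abstract describes the naive algorithm as not scalable.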

Very Large Data Bases | 2012

A highly optimized algorithm for continuous intersection join queries over moving objects

Rui Zhang; Jianzhong Qi; Dan Lin; Wei Wang; Raymond Chi-Wing Wong

Given two sets of moving objects with nonzero extents, the continuous intersection join query reports every pair of intersecting objects, one from each of the two moving object sets, for every timestamp. This type of query is important for a number of applications. For example, in the multi-billion dollar computer game industry, massively multiplayer online games like World of Warcraft need to monitor the intersection among players’ attack ranges and render players’ interaction in real time. The computational cost of a straightforward algorithm or an algorithm adapted from another query type is prohibitive, and answering the query in real time poses a great challenge. Those algorithms compute the query answer for either too long or too short a time interval, which results in either a very large computation cost per answer update or too frequent answer updates, respectively. This observation motivates us to optimize the query processing in the time dimension. In this study, we achieve this optimization by introducing the new concept of time-constrained (TC) processing. Further, TC processing enables a set of effective improvement techniques on traditional intersection join algorithms. Finally, we provide a method to find the optimal value for an important parameter required in our technique, the maximum update interval. As a result, we achieve a highly optimized algorithm for processing continuous intersection join queries on moving objects. With a thorough experimental study, we show that our algorithm outperforms the best adapted existing solution by several orders of magnitude. We also validate the accuracy of our cost model and its effectiveness in optimizing the performance.
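The idea of bounding the time interval for which an answer is computed can be illustrated with a small geometric sketch. It assumes axis-aligned rectangles moving with constant velocity, encoded as the hypothetical tuple `(xlo, ylo, xhi, yhi, vx, vy)`, and a `horizon` argument that stands in for the maximum update interval. It is an illustration of the time constraint only, not the paper's join algorithm.

```python
def _clip(a, b, lo, hi):
    """Clip [lo, hi] to the times t where a + b*t <= 0; None if empty."""
    if b == 0:
        return (lo, hi) if a <= 0 else None
    root = -a / b
    if b > 0:
        hi = min(hi, root)   # inequality holds up to the root
    else:
        lo = max(lo, root)   # inequality holds from the root onwards
    return (lo, hi) if lo <= hi else None

def intersect_interval(a, b, horizon):
    """Time span within [0, horizon] during which boxes a and b intersect."""
    span = (0.0, horizon)
    for axis in (0, 1):
        alo, ahi, av = a[axis], a[axis + 2], a[axis + 4]
        blo, bhi, bv = b[axis], b[axis + 2], b[axis + 4]
        # Overlap on one axis: a.lo(t) <= b.hi(t) and b.lo(t) <= a.hi(t).
        for const, slope in ((alo - bhi, av - bv), (blo - ahi, bv - av)):
            span = _clip(const, slope, *span)
            if span is None:
                return None
    return span

mover = (0, 0, 1, 1, 1, 0)   # unit square moving right at speed 1
static = (4, 0, 5, 1, 0, 0)  # unit square at rest
print(intersect_interval(mover, static, 10))  # (3.0, 5.0)
print(intersect_interval(mover, static, 4))   # (3.0, 4)
print(intersect_interval(mover, static, 2))   # None
```

With a short horizon the pair is either reported for a truncated interval or skipped entirely, which reflects the trade-off between cost per update and update frequency described in the abstract.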

Very Large Data Bases | 2015

Solving the data sparsity problem in destination prediction

Andy Yuan Xue; Jianzhong Qi; Xing Xie; Rui Zhang; Jin Huang; Yuan Li

Destination prediction is an essential task for many emerging location-based applications such as recommending sightseeing places and targeted advertising according to destinations. A common approach to destination prediction is to derive the probability of a location being the destination based on historical trajectories. However, almost all the existing techniques use various kinds of extra information such as road networks, proprietary travel planners, statistics requested from government, and personal driving habits. Such extra information, in most circumstances, is unavailable or very costly to obtain. Therefore, we approach the task of destination prediction by using only a historical trajectory dataset. However, this approach encounters the “data sparsity problem”, i.e., the available historical trajectories are far from enough to cover all possible query trajectories, which considerably limits the number of query trajectories that can obtain predicted destinations. We propose a novel method named Sub-Trajectory Synthesis (SubSyn) to address the data sparsity problem. SubSyn first decomposes historical trajectories into sub-trajectories comprising two adjacent locations, and then connects the sub-trajectories into “synthesised” trajectories. This process effectively expands the historical trajectory dataset to contain many more trajectories. Experiments based on real datasets show that SubSyn can predict destinations for up to ten times more query trajectories than a baseline prediction algorithm. Furthermore, the running time of the SubSyn-training algorithm is almost negligible for a large set of 1.9 million trajectories, and the SubSyn-prediction algorithm consistently runs over two orders of magnitude faster than the baseline prediction algorithm.
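The decomposition step can be sketched as follows. The example assumes trajectories are lists of discrete location labels and models the two-location sub-trajectories as first-order transition frequencies; the function names and the probability model are illustrative assumptions, not the SubSyn training or prediction algorithms themselves.

```python
from collections import Counter, defaultdict

def transition_probs(trajectories):
    """Estimate P(next | current) from pairs of adjacent locations."""
    counts = defaultdict(Counter)
    for traj in trajectories:
        for a, b in zip(traj, traj[1:]):   # two-location sub-trajectories
            counts[a][b] += 1
    return {a: {b: n / sum(nxt.values()) for b, n in nxt.items()}
            for a, nxt in counts.items()}

def path_probability(probs, path):
    """Probability of a path synthesised by chaining sub-trajectories."""
    p = 1.0
    for a, b in zip(path, path[1:]):
        p *= probs.get(a, {}).get(b, 0.0)
    return p

history = [["A", "B", "C"], ["B", "D"]]
probs = transition_probs(history)
print(path_probability(probs, ["A", "B", "D"]))  # 0.5
print(path_probability(probs, ["A", "D"]))       # 0.0
```

The path A, B, D never appears as a whole in `history`, yet it receives a nonzero probability because its two pieces do. This is the sense in which chaining sub-trajectories covers query trajectories that the raw dataset misses.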

Very Large Data Bases | 2014

Processing moving kNN queries using influential neighbor sets

Chuanwen Li; Yu Gu; Jianzhong Qi; Ge Yu; Rui Zhang; Wang Yi

The moving k nearest neighbor query, which computes one's k nearest neighbor set and maintains it while on the move, is gaining importance due to the prevalent use of smart mobile devices such as smartphones. The safe region is a popular technique for processing the moving k nearest neighbor query. It is a region where the movement of the query object does not cause the current k nearest neighbor set to change. Processing a moving k nearest neighbor query is a continuing process of checking the validity of the safe region and recomputing it if invalidated. The size of the safe region largely decides the frequency of safe region recomputation and hence query processing efficiency. Existing moving k nearest neighbor algorithms lack efficiency because they either compute small safe regions and have to recompute them frequently, or compute large safe regions (i.e., an order-k Voronoi cell) at a high cost. In this paper, we take a third approach. Instead of safe regions, we use a small set of safe guarding objects. We prove that, as long as the current k nearest neighbors are closer to the query object than the safe guarding objects, the current k nearest neighbors stay valid and no recomputation is required. This way, we avoid the high cost of safe region recomputation. We also prove that the region defined by the safe guarding objects is the largest possible safe region. This means that the recomputation frequency of our method is also minimized. We conduct extensive experiments comparing our method with the state-of-the-art method on both real and synthetic data sets. The results confirm the superiority of our method.
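The validity condition stated in the abstract can be written as a short check. In this sketch the guarding set is simply every object outside the current result, which is a simplifying assumption that keeps the check trivially correct; how to pick a small guarding set is the paper's contribution and is not reproduced. Function names are illustrative.

```python
from math import dist

def knn(q, objects, k):
    """Brute-force k nearest neighbors of query point q."""
    return sorted(objects, key=lambda o: dist(q, o))[:k]

def still_valid(q, current_knn, guards):
    """True while every current neighbor is at least as close as every guard."""
    farthest_nn = max(dist(q, o) for o in current_knn)
    nearest_guard = min(dist(q, g) for g in guards)
    return farthest_nn <= nearest_guard

objects = [(0, 0), (2, 0), (10, 0), (12, 0)]
result = knn((1, 0), objects, 2)                 # [(0, 0), (2, 0)]
guards = [o for o in objects if o not in result]

print(still_valid((4, 0), result, guards))  # True: no recomputation needed
print(still_valid((7, 0), result, guards))  # False: recompute the kNN set
```

Each position update costs only distance comparisons against the result and the guards, so no region geometry has to be maintained.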

Very Large Data Bases | 2014

Real-time continuous intersection joins over large sets of moving objects using graphic processing units

Phillip G. D. Ward; Zhen He; Rui Zhang; Jianzhong Qi

The Multiple Time Bucket Join (MTB-join) algorithm is the state of the art for processing the continuous intersection join (CI-join) query over moving objects. It considerably outperforms alternatives, but still falls short of real-time application performance requirements for large sets of moving objects. In this paper, we achieve real-time performance for the CI-join query over large sets of moving objects by exploiting the computational power of commodity graphics processing units (GPUs). We first analyze how the main characteristics of the MTB-join algorithm make it ill suited to GPUs and identify key challenges in designing efficient GPU-based algorithms for the query. We then address these challenges by developing the multi-layered grid join (MLG-join) algorithm which has the following key features: (i) memory locality friendly indexing, (ii) no dynamic memory allocation, (iii) in-place object updates, (iv) lock-free concurrent updates, and (v) massive parallelism. These features unleash the full potential of the memory bandwidth and parallel processing of GPUs. Furthermore, we conduct a theoretical analysis which can predict the pruning power of the MLG-join algorithm given certain parameter values used in the algorithm. This allows us to select optimal parameter values. Through extensive experimental results, we show that our analysis accurately models the MLG-join algorithm’s sensitivity to parameter values. The proposed MLG-join algorithm outperforms the MTB-join algorithm, and a GPU-based nested-loops join algorithm, by up to two orders of magnitude, and achieves real-time performance for CI-join queries on large sets of moving objects.
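For readers unfamiliar with grid-based joins, the following is a minimal single-layer, CPU-only sketch of the general idea: hash one set into uniform cells, then probe with the other set. It assumes static axis-aligned boxes given as `(xlo, ylo, xhi, yhi)` and a hypothetical cell `size`. It does not model the multi-layered structure, the moving objects, or the GPU features of the MLG-join algorithm.

```python
from collections import defaultdict

def cells(box, size):
    """Yield every grid cell that a box overlaps."""
    xlo, ylo, xhi, yhi = box
    for cx in range(int(xlo // size), int(xhi // size) + 1):
        for cy in range(int(ylo // size), int(yhi // size) + 1):
            yield cx, cy

def overlaps(a, b):
    """Axis-aligned box intersection test (touching counts as overlap)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def grid_join(set_a, set_b, size):
    """Return sorted index pairs (i, j) where set_a[i] intersects set_b[j]."""
    grid = defaultdict(list)
    for i, box in enumerate(set_a):
        for cell in cells(box, size):
            grid[cell].append(i)
    result = set()  # a pair may share several cells, so deduplicate
    for j, box in enumerate(set_b):
        for cell in cells(box, size):
            for i in grid.get(cell, ()):
                if overlaps(set_a[i], box):
                    result.add((i, j))
    return sorted(result)

set_a = [(0, 0, 2, 2), (10, 10, 12, 12)]
set_b = [(1, 1, 3, 3), (20, 20, 21, 21), (11, 9, 13, 11)]
print(grid_join(set_a, set_b, 4))  # [(0, 0), (1, 2)]
```

Only objects that share a cell are compared, so the cell size controls how many candidate pairs reach the exact test. This is presumably among the parameters whose effect on pruning power the paper's analysis models.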

Information Systems | 2014

Continuous visible k nearest neighbor query on moving objects

Yanqiu Wang; Rui Zhang; Chuanfei Xu; Jianzhong Qi; Yu Gu; Ge Yu

A visible k nearest neighbor (VkNN) query retrieves k objects that are visible and nearest to the query object, where “visible” means that there is no obstacle between an object and the query object. Existing studies on the VkNN query have focused on static data objects. In this paper we investigate how to process the query on moving objects continuously. We propose an effective filtering-and-refinement framework for evaluating this type of query. We exploit spatial proximity and visibility properties between the query object and data objects to prune the search space under this framework. A detailed cost analysis and a comprehensive experimental study are conducted on the proposed framework. The results validate the effectiveness of the pruning techniques and verify the efficiency of the proposed framework. The proposed framework outperforms a straightforward solution by an order of magnitude in terms of both communication and computation costs.

World Wide Web | 2014

The min-dist location selection and facility replacement queries

Jianzhong Qi; Rui Zhang; Yanqiu Wang; Andy Yuan Xue; Ge Yu; Lars Kulik

We propose and study a new type of location optimization problem, the min-dist location selection problem: given a set of clients and a set of existing facilities, we select a location from a given set of potential locations for establishing a new facility, so that the average distance between a client and her nearest facility is minimized. The problem has a wide range of applications in urban development simulation, massively multiplayer online games, and decision support systems. We also investigate a variant of the problem, where we consider replacing (instead of adding) a facility while achieving the same optimization goal. We call this variant the min-dist facility replacement problem. We explore two common approaches to location optimization problems and present methods based on those approaches for solving the min-dist location selection problem. However, those methods either need to maintain an extra index or fall short in efficiency. To address their drawbacks, we propose a novel method (named MND), which has very close performance to the fastest method but does not need an extra index. We then utilize the key idea behind MND to approach the min-dist facility replacement problem, which results in two algorithms named MSND and RID. We provide a detailed comparative cost analysis and conduct extensive experiments on the various algorithms. The results show that MND and RID outperform their competitors by orders of magnitude.

International Conference on Data Engineering | 2015

The safest path via safe zones

Saad Aljubayrin; Jianzhong Qi; Christian S. Jensen; Rui Zhang; Zhen He; Zeyi Wen

We define and study Euclidean and spatial network variants of a new path finding problem: given a set of safe zones, find paths that minimize the distance traveled outside the safe zones. In this problem, the entire space with the exception of the safe zones is unsafe, but passable, and it differs from problems that involve unsafe regions to be strictly avoided. As a result, existing algorithms are not effective solutions to the new problem. To solve the Euclidean variant, we devise a transformation of the continuous data space with safe zones into a discrete graph upon which shortest path algorithms apply. A naive transformation yields a very large graph that is expensive to search. In contrast, our transformation exploits properties of hyperbolas in the Euclidean space to safely eliminate graph edges, thus improving performance without affecting the shortest path results. To solve the spatial network variant, we propose a different graph-to-graph transformation that identifies critical points that serve the same purpose as do the hyperbolas, thus avoiding the creation of extraneous edges. This transformation can be extended to support a weighted version of the problem, where travel in safe zones has non-zero cost. We conduct extensive experiments using both real and synthetic data. The results show that our approaches outperform baseline approaches by more than an order of magnitude in graph construction time, storage space and query response time.
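Once a graph is available, the spatial network variant reduces to a shortest path search in which each edge is weighted by the length travelled outside safe zones. The sketch below assumes that these per-edge unsafe lengths have already been computed and are given as `(u, v, unsafe_length)` triples, and it uses plain Dijkstra. The paper's graph transformations and edge elimination are not reproduced, and the function name is illustrative.

```python
import heapq

def safest_path(edges, src, dst):
    """Return (unsafe_distance, path) minimising travel outside safe zones."""
    graph = {}
    for u, v, unsafe in edges:           # undirected network
        graph.setdefault(u, []).append((v, unsafe))
        graph.setdefault(v, []).append((u, unsafe))
    best, prev = {src: 0.0}, {}
    heap = [(0.0, src)]
    while heap:
        cost, u = heapq.heappop(heap)
        if u == dst:
            break
        if cost > best.get(u, float("inf")):
            continue                     # stale heap entry
        for v, unsafe in graph.get(u, []):
            new_cost = cost + unsafe
            if new_cost < best.get(v, float("inf")):
                best[v], prev[v] = new_cost, u
                heapq.heappush(heap, (new_cost, v))
    if dst not in best:
        return float("inf"), []
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return best[dst], path[::-1]

# Edge C-D lies entirely inside a safe zone, so its unsafe length is 0.
edges = [("A", "B", 4.0), ("A", "C", 1.0), ("C", "D", 0.0), ("D", "B", 1.0)]
print(safest_path(edges, "A", "B"))  # (2.0, ['A', 'C', 'D', 'B'])
```

The detour through C and D is preferred to the direct edge because only the unsafe portion of the trip is counted. For the weighted version mentioned in the abstract, a nonzero cost for travel inside safe zones could be added to each edge weight.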

ACM Transactions on Database Systems | 2014

Towards a Painless Index for Spatial Objects

Rui Zhang; Jianzhong Qi; Martin Stradling; Jin Huang

Conventional spatial indexes, represented by the R-tree, employ multidimensional tree structures that are complicated and require enormous effort to implement in a full-fledged database management system (DBMS). An alternative approach for supporting spatial queries is mapping-based indexing, which maps both data and queries into a one-dimensional space such that data can be indexed and queries can be processed through a one-dimensional indexing structure such as the B+-tree. Mapping-based indexing requires implementing only a few mapping functions, incurring much less effort in implementation compared to conventional spatial index structures. Yet, a major concern about using mapping-based indexes is their lower efficiency than conventional tree structures. In this article, we propose a mapping-based spatial indexing scheme called Size Separation Indexing (SSI). SSI is equipped with a suite of techniques including size separation, data distribution transformation, and more efficient mapping algorithms. These techniques overcome the drawbacks of existing mapping-based indexes and significantly improve the efficiency of query processing. We show through extensive experiments that, for window queries on spatial objects with nonzero extents, SSI has two orders of magnitude better performance than existing mapping-based indexes and competitive performance to the R-tree as a standalone implementation. We have also implemented SSI on top of two off-the-shelf DBMSs, PostgreSQL and a commercial platform, both of which have R-tree implementations. In this case, SSI is up to two orders of magnitude faster than their provided spatial indexes. Therefore, we achieve a spatial index more efficient than the R-tree in a DBMS implementation that is at the same time easy to implement. This result may upset a common perception that has existed for a long time in this area that the R-tree is the best choice for indexing spatial objects.
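To show what mapping-based indexing looks like in its simplest form, the sketch below maps 2D points to one-dimensional keys with a Z-order (Morton) curve and answers window queries with a range scan plus a filter. It is a generic illustration, not SSI: it handles points rather than objects with nonzero extents, assumes non-negative integer coordinates below `2**bits`, and uses a sorted Python list as a stand-in for a B+-tree. The names `z_value` and `ZIndex` are illustrative.

```python
from bisect import bisect_left, bisect_right

def z_value(x, y, bits=16):
    """Interleave the bits of x and y into a single Z-order key."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)
        z |= ((y >> i) & 1) << (2 * i + 1)
    return z

class ZIndex:
    def __init__(self, points):
        # Sorted (key, point) pairs play the role of the 1D index.
        self.entries = sorted((z_value(x, y), (x, y)) for x, y in points)
        self.keys = [z for z, _ in self.entries]

    def window(self, x1, y1, x2, y2):
        """Points inside the window with corners (x1, y1) and (x2, y2)."""
        # Every point in the window has a key between the keys of the
        # lower-left and upper-right corners, so one range scan suffices.
        lo = bisect_left(self.keys, z_value(x1, y1))
        hi = bisect_right(self.keys, z_value(x2, y2))
        # The key range may also cover points outside the window; filter them.
        return [p for _, p in self.entries[lo:hi]
                if x1 <= p[0] <= x2 and y1 <= p[1] <= y2]

index = ZIndex([(1, 1), (2, 3), (5, 5), (7, 2), (3, 6)])
print(index.window(1, 1, 3, 3))  # [(1, 1), (2, 3)]
print(index.window(4, 0, 7, 3))  # [(7, 2)]
```

The only spatial logic is the mapping function and the final filter, which reflects the low implementation effort the abstract attributes to mapping-based indexes. The filtering step also hints at where their efficiency concern comes from, since a key range can cover many points outside the window.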
