Is this you? Create Your Porfile

Xiang Lian

University of Texas–Pan American

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Xiang Lian is active.

Explore More

Publication

Featured researches published by Xiang Lian.

very large data bases | 2004

Reverse kNN search in arbitrary dimensionality

Yufei Tao; Dimitris Papadias; Xiang Lian

Given a point q, a reverse k nearest neighbor (RkNN) query retrieves all the data points that have q as one of their k nearest neighbors. Existing methods for processing such queries have at least one of the following deficiencies: (i) they do not support arbitrary values of k (ii) they cannot deal efficiently with database updates, (iii) they are applicable only to 2D data (but not to higher dimensionality), and (iv) they retrieve only approximate results. Motivated by these shortcomings, we develop algorithms for exact processing of RkNN with arbitrary values of k on dynamic multidimensional datasets. Our methods utilize a conventional data-partitioning index on the dataset and do not require any pre-computation. In addition to their flexibility, we experimentally verify that the proposed algorithms outperform the existing ones even in their restricted focus.

international conference on management of data | 2008

Monochromatic and bichromatic reverse skyline search over uncertain databases

Xiang Lian; Lei Chen

Reverse skyline queries over uncertain databases have many important applications such as sensor data monitoring and business planning. Due to the existence of uncertainty in many real-world data, answering reverse skyline queries accurately and efficiently over uncertain data has become increasingly important. In this paper, we model the probabilistic reverse skyline query on uncertain data, in both monochromatic and bichromatic cases, and propose effective pruning methods to reduce the search space of query processing. Moreover, efficient query procedures have been presented seamlessly integrating the proposed pruning methods. Extensive experiments have demonstrated the efficiency and effectiveness of our proposed approach with various experimental settings.

very large data bases | 2009

Efficient processing of probabilistic reverse nearest neighbor queries over uncertain data

Xiang Lian; Lei Chen

Reverse nearest neighbor (RNN) search is very crucial in many real applications. In particular, given a database and a query object, an RNN query retrieves all the data objects in the database that have the query object as their nearest neighbors. Often, due to limitation of measurement devices, environmental disturbance, or characteristics of applications (for example, monitoring moving objects), data obtained from the real world are uncertain (imprecise). Therefore, previous approaches proposed for answering an RNN query over exact (precise) database cannot be directly applied to the uncertain scenario. In this paper, we re-define the RNN query in the context of uncertain databases, namely probabilistic reverse nearest neighbor (PRNN) query, which obtains data objects with probabilities of being RNNs greater than or equal to a user-specified threshold. Since the retrieval of a PRNN query requires accessing all the objects in the database, which is quite costly, we also propose an effective pruning method, called geometric pruning (GP), that significantly reduces the PRNN search space yet without introducing any false dismissals. Furthermore, we present an efficient PRNN query procedure that seamlessly integrates our pruning method. Extensive experiments have demonstrated the efficiency and effectiveness of our proposed GP-based PRNN query processing approach, under various experimental settings.

IEEE Transactions on Knowledge and Data Engineering | 2008

Probabilistic Group Nearest Neighbor Queries in Uncertain Databases

Xiang Lian; Lei Chen

The importance of query processing over uncertain data has recently arisen due to its wide usage in many real-world applications. In the context of uncertain databases, previous works have studied many query types such as nearest neighbor query, range query, top-k query, skyline query, and similarity join. In this paper, we focus on another important query, namely, probabilistic group nearest neighbor (PGNN) query, in the uncertain database, which also has many applications. Specifically, given a set, Q, of query points, a PGNN query retrieves data objects that minimize the aggregate distance (e.g., sum, min, and max) to query set Q. Due to the inherent uncertainty of data objects, previous techniques to answer group nearest neighbor (GNN) query cannot be directly applied to our PGNN problem. Motivated by this, we propose effective pruning methods, namely, spatial pruning and probabilistic pruning, to reduce the PGNN search space, which can be seamlessly integrated into our PGNN query procedure. Extensive experiments have demonstrated the efficiency and effectiveness of our proposed approach, in terms of the wall clock time and the speed-up ratio against linear scan.

extending database technology | 2009

Top-k dominating queries in uncertain databases

Xiang Lian; Lei Chen

Due to the existence of uncertain data in a wide spectrum of real applications, uncertain query processing has become increasingly important, which dramatically differs from handling certain data in a traditional database. In this paper, we formulate and tackle an important query, namely probabilistic top-k dominating (PTD) query, in the uncertain database. In particular, a PTD query retrieves k uncertain objects that are expected to dynamically dominate the largest number of uncertain objects. We propose an effective pruning approach to reduce the PTD search space, and present an efficient query procedure to answer PTD queries. Furthermore, approximate PTD query processing and the case where the PTD query is issued from an uncertain query object are also discussed. Extensive experiments have demonstrated the efficiency and effectiveness of our proposed PTD query processing approaches.

very large data bases | 2007

Multidimensional reverse kNN search

Yufei Tao; Dimitris Papadias; Xiang Lian; Xiaokui Xiao

Given a multidimensional point q, a reverse k nearest neighbor (RkNN) query retrieves all the data points that have q as one of their k nearest neighbors. Existing methods for processing such queries have at least one of the following deficiencies: they (i) do not support arbitrary values of k, (ii) cannot deal efficiently with database updates, (iii) are applicable only to 2D data but not to higher dimensionality, and (iv) retrieve only approximate results. Motivated by these shortcomings, we develop algorithms for exact RkNN processing with arbitrary values of k on dynamic, multidimensional datasets. Our methods utilize a conventional data-partitioning index on the dataset and do not require any pre-computation. As a second step, we extend the proposed techniques to continuous RkNN search, which returns the RkNN results for every point on a line segment. We evaluate the effectiveness of our algorithms with extensive experiments using both real and synthetic datasets.

extending database technology | 2008

Dynamic skyline queries in metric spaces

Lei Chen; Xiang Lian

Skyline query is of great importance in many applications, such as multi-criteria decision making and business planning. In particular, a skyline point is a data object in the database whose attribute vector is not dominated by that of any other objects. Previous methods to retrieve skyline points usually assume static data objects in the database (i.e. their attribute vectors are fixed), whereas several recent work focus on skyline queries with dynamic attributes. In this paper, we propose a novel variant of skyline queries, namely metric skyline, whose dynamic attributes are defined in the metric space (i.e. not limited to the Euclidean space). We illustrate an efficient and effective pruning mechanism to answer metric skyline queries through a metric index. Extensive experiments have demonstrated the efficiency and effectiveness of our proposed pruning techniques over the metric index in answering metric skyline queries.

ACM Transactions on Database Systems | 2010

Reverse skyline search in uncertain databases

Xiang Lian; Lei Chen

Reverse skyline queries over uncertain databases have many important applications such as sensor data monitoring and business planning. Due to the wide existence of uncertainty in many real-world data, answering reverse skyline queries accurately and efficiently over uncertain data has become increasingly important. In this article, we formalize the probabilistic reverse skyline query over uncertain data, in both monochromatic and bichromatic cases, and propose effective pruning methods, namely spatial pruning and probabilistic pruning, to reduce the search space of the reverse skyline query processing. Moreover, efficient query procedures have been presented seamlessly integrating the proposed pruning methods. Furthermore, a novel query type, namely Probabilistic Reverse Furthest Skyline (PRFS) query, is proposed and tackled under “the larger, the better” dominance semantics of skyline. Variants of probabilistic reverse skyline have been proposed and tackled, including those that return objects with top-k highest probabilities and that retrieve top-k reverse skylines. Extensive experiments demonstrated the efficiency and effectiveness of our approaches with various experimental settings.

Information Sciences | 2012

Continuous monitoring of skylines over uncertain data streams

Xiaofeng Ding; Xiang Lian; Lei Chen; Hai Jin

Uncertain data are inevitable in many applications due to various factors such as the limitations of measuring equipment and delays in data updates. Although modeling and querying uncertain data have recently attracted considerable attention from the database community, there are still many critical issues to be resolved with respect to conducting advanced analysis on uncertain data. In this paper, we study the execution of the probabilistic skyline query over uncertain data streams. We propose a novel sliding window skyline model where an uncertain tuple may take the probability to be in the skyline at a certain timestamp t. Formally, a Wp-Skyline(p,t) contains all the tuples whose probabilities of becoming skylines are at least p at timestamp t. However, in the stream environment, computing a probabilistic skyline on a large number of uncertain tuples within the sliding window is a daunting task in practice. In order to efficiently calculate Wp-Skyline, we propose an efficient and effective approach, namely the candidate list approach, which maintains lists of candidates that might become skylines in future sliding windows. We also propose algorithms that continuously monitor the newly incoming and expired data to maintain the skyline candidate set incrementally. To further reduce the computation cost of deciding whether or not a candidate tuple belongs to the skyline, we propose an enhanced refinement strategy that is based on a multi-dimensional indexing structure combined with a grouping-and-conquer strategy. To validate the effectiveness of our proposed approach, we conduct extensive experiments on both real and synthetic data sets and make comparisons with basic techniques.

very large data bases | 2015

Reliable diversity-based spatial crowdsourcing by moving workers

Peng Cheng; Xiang Lian; Zhao Chen; Rui Fu; Lei Chen; Jinsong Han; Jizhong Zhao

With the rapid development of mobile devices and the crowdsourcing platforms, the spatial crowdsourcing has attracted much attention from the database community, specifically, spatial crowdsourcing refers to sending a location-based request to workers according to their positions. In this paper, we consider an important spatial crowdsourcing problem, namely reliable diversity-based spatial crowdsourcing (RDB-SC), in which spatial tasks (such as taking videos/photos of a landmark or firework shows, and checking whether or not parking spaces are available) are time-constrained, and workers are moving towards some directions. Our RDB-SC problem is to assign workers to spatial tasks such that the completion reliability and the spatial/temporal diversities of spatial tasks are maximized. We prove that the RDB-SC problem is NP-hard and intractable. Thus, we propose three effective approximation approaches, including greedy, sampling, and divide-and-conquer algorithms. In order to improve the efficiency, we also design an effective cost-model-based index, which can dynamically maintain moving workers and spatial tasks with low cost, and efficiently facilitate the retrieval of RDB-SC answers. Through extensive experiments, we demonstrate the efficiency and effectiveness of our proposed approaches over both real and synthetic datasets.

Explore More