Dong-Wan Choi | Researchain

Publication

Featured research published by Dong-Wan Choi.

very large data bases | 2012

A scalable algorithm for maximizing range sum in spatial databases

Dong-Wan Choi; Chin-Wan Chung; Yufei Tao

This paper investigates the MaxRS problem in spatial databases. Given a set O of weighted points and a rectangular region r of a given size, the goal of the MaxRS problem is to find a location of r such that the sum of the weights of all the points covered by r is maximized. This problem is useful in many location-based applications, such as finding the best place for a new franchise store with a limited delivery range or the most attractive place for a tourist with a limited reachable range. However, the problem has been studied mainly in theory, particularly in computational geometry. The existing algorithms from the computational geometry community are in-memory algorithms that do not guarantee scalability. In this paper, we propose a scalable external-memory algorithm (ExactMaxRS) for the MaxRS problem, which is optimal in terms of I/O complexity. Furthermore, we propose an approximation algorithm (ApproxMaxCRS) for the MaxCRS problem, a circular version of the MaxRS problem. We prove the correctness and optimality of the ExactMaxRS algorithm along with the approximation bound of the ApproxMaxCRS algorithm. Extensive experimental results show that the ExactMaxRS algorithm is two orders of magnitude faster than methods adapted from existing algorithms, and that the approximation quality of ApproxMaxCRS in practice is much better than its theoretical bound.
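To make the problem statement concrete, here is a minimal in-memory brute-force sketch of MaxRS. It is not ExactMaxRS and has none of its I/O guarantees. It assumes non-negative weights and a closed, axis-aligned rectangle, in which case some optimal placement has its lower-left corner at an (x, y) pair taken from the data coordinates. The function name `brute_force_maxrs` and the tuple layout are illustrative choices, not from the paper.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float, float]  # (x, y, weight); layout is an assumption


def brute_force_maxrs(
    points: List[Point], width: float, height: float
) -> Tuple[Optional[Tuple[float, float]], float]:
    """Return (lower-left corner, covered weight) of a best placement.

    Assumes non-negative weights, so an optimal rectangle can be slid
    right/up until its left and bottom edges touch covered points.
    Runs in O(n^3) time; for illustration only.
    """
    best_corner, best_sum = None, 0.0
    xs = sorted({p[0] for p in points})
    ys = sorted({p[1] for p in points})
    for x0 in xs:
        for y0 in ys:
            # Sum the weights of all points inside the closed rectangle.
            total = sum(
                w
                for x, y, w in points
                if x0 <= x <= x0 + width and y0 <= y <= y0 + height
            )
            if total > best_sum:
                best_corner, best_sum = (x0, y0), total
    return best_corner, best_sum


demo_points = [(0.0, 0.0, 1.0), (1.0, 1.0, 2.0), (1.5, 0.5, 1.0), (5.0, 5.0, 3.0)]
demo_corner, demo_sum = brute_force_maxrs(demo_points, width=2.0, height=2.0)
```

In this demo, the 2 x 2 rectangle anchored at the first point covers the three clustered points, which outweighs the isolated point of weight 3.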

very large data bases | 2013

Approximate MaxRS in spatial databases

Yufei Tao; Xiaocheng Hu; Dong-Wan Choi; Chin-Wan Chung

In the maximizing range sum (MaxRS) problem, given (i) a set P of 2D points each of which is associated with a positive weight, and (ii) a rectangle r of specific extents, we need to decide where to place r in order to maximize the covered weight of r, that is, the total weight of the data points covered by r. Algorithms solving the problem exactly entail expensive CPU or I/O cost. In practice, exact answers are often not compulsory in a MaxRS application, where slight imprecision can often be comfortably tolerated, provided that approximate answers can be computed considerably faster. Motivated by this, the present paper studies the (1 - ε)-approximate MaxRS problem, which admits the same inputs as MaxRS, but aims instead to return a rectangle whose covered weight is at least (1 - ε)m*, where m* is the optimal covered weight, and ε can be an arbitrarily small constant between 0 and 1. We present fast algorithms that settle this problem with strong theoretical guarantees.

ACM Transactions on Database Systems | 2014

Maximizing Range Sum in External Memory

Dong-Wan Choi; Chin-Wan Chung; Yufei Tao

This article studies the MaxRS problem in spatial databases. Given a set O of weighted points and a rectangle r of a given size, the goal of the MaxRS problem is to find a location of r such that the sum of the weights of all the points covered by r is maximized. This problem is useful in many location-based services, such as finding the best place for a new franchise store with a limited delivery range or the hotspot with the largest number of nearby attractions for a tourist with a limited reachable range. However, the problem has been studied mainly from a theoretical perspective, particularly in computational geometry. The existing algorithms from the computational geometry community are in-memory algorithms that do not guarantee scalability. In this article, we propose a scalable external-memory algorithm (ExactMaxRS) for the MaxRS problem that is optimal in terms of I/O complexity. In addition, we propose an approximation algorithm (ApproxMaxCRS) for the MaxCRS problem, a circular version of the MaxRS problem. We prove the correctness and optimality of the ExactMaxRS algorithm along with the approximation bound of the ApproxMaxCRS algorithm. Furthermore, motivated by the fact that all the existing solutions simply assume that there is no tied area for the best location, we extend the MaxRS problem to a more fundamental problem, namely AllMaxRS, so that all the locations with the same best score can be retrieved. We first prove that the AllMaxRS problem cannot be trivially solved by applying the techniques for the MaxRS problem. Then we propose an output-sensitive external-memory algorithm (TwoPhaseMaxRS) that gives the exact solution for the AllMaxRS problem in two phases. We also prove both the soundness and completeness of the result returned by TwoPhaseMaxRS. Extensive experimental results show that ExactMaxRS and ApproxMaxCRS are several orders of magnitude faster than methods adapted from existing algorithms, that the approximation quality of ApproxMaxCRS in practice is much better than its theoretical bound, and that TwoPhaseMaxRS is not only much faster but also more robust than the straightforward extension of ExactMaxRS.

international conference on data engineering | 2016

Finding the minimum spatial keyword cover

Dong-Wan Choi; Jian Pei; Xuemin Lin

Existing work on spatial keyword search focuses on finding a group of spatial objects that covers all the query keywords and minimizes the diameter of the group. However, we observe that such a formulation may not address what users need in some application scenarios. In this paper, we introduce a novel spatial keyword cover problem (SK-COVER for short), which aims to identify the group of spatio-textual objects covering all keywords in a query and minimizing a distance cost function that leads to fewer proximate objects in the answer set. We prove that SK-COVER is not only NP-hard but also does not allow an approximation better than O(log m) in polynomial time, where m is the number of query keywords. We establish an O(log m)-approximation algorithm, which is asymptotically optimal in terms of the approximability of SK-COVER. Furthermore, we devise effective accessing strategies and pruning rules to improve the overall efficiency and scalability. In addition to our algorithmic results, we empirically show that our approximation algorithm always achieves the best accuracy, and that its efficiency is comparable to that of a state-of-the-art algorithm intended for mCK, a problem similar to yet theoretically easier than SK-COVER.
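As background for the O(log m) bound, the sketch below shows the classic greedy heuristic for unweighted set cover, applied to query keywords. This is not the paper's algorithm: it ignores object locations and the distance cost function entirely, and only illustrates the greedy covering idea on which logarithmic approximation bounds are typically built. The function name and the dictionary layout are assumptions made for the example.

```python
from typing import Dict, Iterable, List, Set


def greedy_keyword_cover(
    objects: Dict[str, Set[str]], query_keywords: Iterable[str]
) -> List[str]:
    """Pick objects until every query keyword is covered.

    Classic greedy set cover: repeatedly choose the object that covers
    the most still-uncovered keywords. Distances are NOT considered.
    """
    uncovered = set(query_keywords)
    chosen: List[str] = []
    while uncovered:
        best_id, best_gain = None, 0
        for obj_id, keywords in objects.items():
            gain = len(uncovered & keywords)
            if gain > best_gain:
                best_id, best_gain = obj_id, gain
        if best_id is None:
            # No object contains any of the remaining keywords.
            raise ValueError(f"keywords cannot be covered: {sorted(uncovered)}")
        chosen.append(best_id)
        uncovered -= objects[best_id]
    return chosen


demo_objects = {
    "a": {"cafe", "wifi"},
    "b": {"wifi"},
    "c": {"park", "atm"},
    "d": {"atm"},
}
demo_cover = greedy_keyword_cover(demo_objects, {"cafe", "wifi", "park", "atm"})
```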

symposium on large spatial databases | 2013

DART: an efficient method for direction-aware bichromatic reverse k nearest neighbor queries

Kyoung-Won Lee; Dong-Wan Choi; Chin-Wan Chung

This paper presents a novel type of query in spatial databases, called the direction-aware bichromatic reverse k nearest neighbor (DBRkNN) query, which extends the bichromatic reverse nearest neighbor query. Given two disjoint sets, P and S, of spatial objects, and a query object q in S, the DBRkNN query returns a subset P′ of P such that the k nearest neighbors of each object in P′ include q and each object in P′ has a direction toward q within a pre-defined distance. We formally define the DBRkNN query, and then propose an efficient algorithm, called DART, for processing it. Our method utilizes a grid-based index to cluster the spatial objects, and a B+-tree to index the direction angle. We adopt a filter-refinement framework that is widely used in algorithms for reverse nearest neighbor queries. In the filtering step, DART eliminates all the objects that are farther from the query object than the pre-defined distance, or that have an invalid direction angle. In the refinement step, each remaining object is checked to verify that the query object is actually one of its k nearest neighbors. Extensive experiments show that DART outperforms an R-tree-based naive algorithm in both indexing time and query processing time.
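The filter-refinement flow described above can be sketched as a naive linear scan. This is not DART: it uses no grid index or B+-tree. It also assumes a specific reading of "direction toward q", namely that the angle between an object's heading and the bearing from the object to q is at most a tolerance `tol`; that parameter, the tuple layouts, and the function names are hypothetical.

```python
import math
from typing import List, Tuple

Directed = Tuple[float, float, float]  # (x, y, heading in radians); assumed layout
Site = Tuple[float, float]             # (x, y)


def angle_diff(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, in [0, pi]."""
    d = abs(a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)


def naive_dbrknn(
    P: List[Directed], S: List[Site], q: Site, k: int, max_dist: float, tol: float
) -> List[Directed]:
    result = []
    for px, py, heading in P:
        dist_q = math.hypot(q[0] - px, q[1] - py)
        # Filtering step: drop objects too far away or facing elsewhere.
        if dist_q > max_dist:
            continue
        bearing = math.atan2(q[1] - py, q[0] - px)
        if angle_diff(heading, bearing) > tol:
            continue
        # Refinement step: q is among the k nearest sites of this object
        # iff fewer than k other sites are strictly closer than q.
        closer = sum(
            1 for s in S if s != q and math.hypot(s[0] - px, s[1] - py) < dist_q
        )
        if closer < k:
            result.append((px, py, heading))
    return result


demo_P = [(0.0, 0.0, 0.0), (0.0, 0.0, math.pi), (10.0, 0.0, math.pi)]
demo_S = [(1.0, 0.0), (0.5, 0.0), (3.0, 3.0)]
demo_q = (1.0, 0.0)
demo_k2 = naive_dbrknn(demo_P, demo_S, demo_q, k=2, max_dist=5.0, tol=math.pi / 4)
demo_k1 = naive_dbrknn(demo_P, demo_S, demo_q, k=1, max_dist=5.0, tol=math.pi / 4)
```

In the demo, only the first object both faces q and is within range; it qualifies for k = 2 but not for k = 1, because one other site is closer to it than q.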

Information Systems | 2017

A K-partitioning algorithm for clustering large-scale spatio-textual data

Dong-Wan Choi; Chin-Wan Chung

The volume of spatio-textual data is increasing drastically, which makes it more and more essential to process such large-scale spatio-textual datasets. Even though numerous works have studied how to answer various kinds of spatio-textual queries, methods for analyzing spatio-textual data have rarely been considered so far. Motivated by this, this paper proposes a k-means-based clustering algorithm specialized for massive spatio-textual data. One of the strong points of the k-means algorithm lies in its efficiency and scalability, implying that it is appropriate for large-scale data. However, it is challenging to apply the normal k-means algorithm to spatio-textual data, since each spatio-textual object has non-numeric attributes, that is, a textual dimension, as well as numeric attributes, that is, a spatial dimension. We address this problem by using the expected distance between a random pair of objects rather than constructing an actual centroid for each cluster. Based on our experimental results, we show that the clustering quality of our algorithm is comparable to that of other k-partitioning algorithms that can process spatio-textual data, and that its efficiency is superior to that of those competitors.

Highlights: The problem of clustering large-scale spatio-textual data is studied for the first time; it has many real applications, such as location-based data cleaning. A modified version of the k-means clustering algorithm is developed for spatio-textual data using the expected pairwise distance. Experimentally, our algorithm is not only fast enough to tackle a massive spatio-textual dataset, but also fairly effective in terms of quality.
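One way to picture a centroid-free assignment step is to score a cluster by the average distance from an object to the cluster's members, that is, the expected distance to a randomly drawn member. The sketch below is only an illustration of that idea, not the paper's algorithm: the combined distance (a weighted sum of Euclidean distance and Jaccard distance on keyword sets), the weight `alpha`, and all names are assumptions, and the spatial term is left unnormalized for brevity.

```python
import math
from typing import FrozenSet, List, Tuple

STObject = Tuple[float, float, FrozenSet[str]]  # (x, y, keywords); assumed layout


def st_distance(a: STObject, b: STObject, alpha: float = 0.5) -> float:
    """Assumed combined distance: alpha * Euclidean + (1 - alpha) * Jaccard."""
    spatial = math.hypot(a[0] - b[0], a[1] - b[1])
    union = a[2] | b[2]
    textual = 1.0 - len(a[2] & b[2]) / len(union) if union else 0.0
    return alpha * spatial + (1.0 - alpha) * textual


def expected_distance(obj: STObject, cluster: List[STObject], alpha: float = 0.5) -> float:
    """Mean distance from obj to a uniformly random member of a non-empty cluster."""
    return sum(st_distance(obj, m, alpha) for m in cluster) / len(cluster)


def assign(obj: STObject, clusters: List[List[STObject]], alpha: float = 0.5) -> int:
    """Index of the cluster with the smallest expected distance to obj."""
    return min(
        range(len(clusters)),
        key=lambda i: expected_distance(obj, clusters[i], alpha),
    )


demo_clusters = [
    [(0.0, 0.0, frozenset({"cafe"})), (0.1, 0.0, frozenset({"cafe", "wifi"}))],
    [(5.0, 5.0, frozenset({"park"}))],
]
demo_obj = (0.2, 0.0, frozenset({"cafe"}))
demo_label = assign(demo_obj, demo_clusters)
```

A direct evaluation like this costs time proportional to the cluster size per object, so a scalable method would need to avoid recomputing all pairwise distances.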

intelligent information systems | 2014

DART+: Direction-aware bichromatic reverse k nearest neighbor query processing in spatial databases

Kyoung-Won Lee; Dong-Wan Choi; Chin-Wan Chung

This article presents a novel type of query in spatial databases, called the direction-aware bichromatic reverse k nearest neighbor (DBRkNN) query, which extends the bichromatic reverse nearest neighbor query. Given two disjoint sets, P and S, of spatial objects, and a query object q in S, the DBRkNN query returns a subset P′ of P such that the k nearest neighbors of each object in P′ include q and each object in P′ has a direction toward q within a pre-defined distance. We formally define the DBRkNN query, and then propose an efficient algorithm, called DART, for processing it. Our method utilizes a grid-based index to cluster the spatial objects, and a B+-tree to index the direction angle. We adopt a filter-refinement framework that is widely used in algorithms for reverse nearest neighbor queries. In the filtering step, DART eliminates all the objects that are farther from the query object than a pre-defined distance, or that have an invalid direction angle. In the refinement step, each remaining object is checked to verify that the query object is actually one of its k nearest neighbors. As a major extension of DART, we also present an improved algorithm, called DART+, for DBRkNN queries. Extensive experiments with several datasets show that DART outperforms an R-tree-based naive algorithm in both indexing time and query processing time. In addition, our extension algorithm, DART+, shows significantly better performance than DART.

Information Sciences | 2013

REQUEST+: A framework for efficient processing of region-based queries in sensor networks

Dong-Wan Choi; Chin-Wan Chung

In wireless sensor networks, individual sensing values are not reliable due to node failures. The effect of these failures can be reduced by using aggregated values for groups of sensor nodes instead of the individual sensing values. However, most existing works have focused on computing the aggregation of all the nodes without grouping, and only a few approaches have dealt with the processing of grouped aggregate queries. Moreover, since the groups in those approaches are disjoint, areas that are not covered by any group cannot be considered, even if they are relevant to the user's interest. In this paper, we propose a new type of query, the region-based query, and a framework for processing region-based queries, called REQUEST+. A region in REQUEST+ is defined as a maximal set of nodes located within a circle having a diameter specified in the query. To efficiently construct a large number of regions covering the entire monitoring area, we build the SEC (Smallest Enclosing Circle) index. Moreover, in order to process a region-based query, we adapt a clustering-based aggregation method, in which there is a leader node for each region. To minimize the communication cost, we formulate an optimal leader selection problem and prove that it is NP-hard. In addition, we transform the problem into the weighted set-cover problem so that an algorithm devised for that problem can be used. Finally, we construct a query-initiated routing tree for the communication between the leader and non-leader nodes. The experimental results show that the result of our region-based query is more reliable than that of a query based on individual nodes, and that our processing method is more energy-efficient than existing methods for processing grouped aggregate queries.

Geoinformatica | 2016

The direction-constrained k nearest neighbor query

Min-Joong Lee; Dong-Wan Choi; Sang-Yeon Kim; Ha-Myung Park; Sunghee Choi; Chin-Wan Chung

Finding the k nearest neighbor objects in spatial databases is a fundamental problem in many geospatial systems, and direction is one of the key features of a spatial object. Moreover, the recent tremendous growth of sensor technologies in mobile devices produces an enormous number of spatio-directional (i.e., spatially and directionally encoded) objects such as photos. Therefore, an efficient and proper utilization of the direction feature is a new challenge. Inspired by this issue and the traditional k nearest neighbor search problem, we devise a new type of query, called the direction-constrained k nearest neighbor (DCkNN) query. The DCkNN query finds the k nearest neighbors from the location of the query such that the direction of each neighbor is within a certain range of the direction of the query. We develop a new index structure, called MULTI, to efficiently answer the DCkNN query with two novel index access algorithms based on cost analysis. Furthermore, our problem and solution can be generalized to deal with objects that combine spatial dimensions with a circulant dimension, such as a direction or a circulant period of time (an hour, a day, or a week). Experimental results show that our proposed index structure and access algorithms outperform two algorithms adapted from existing kNN algorithms.
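The query semantics can be illustrated with a naive linear scan, sketched below. It does not use the MULTI index or the paper's access algorithms. It assumes "within a certain range" means that the circular difference between an object's direction and the query direction is at most `max_diff`; the `period` parameter hints at the circulant generalization (for example, 24 for hour of day). All names and the tuple layout are hypothetical.

```python
import heapq
import math
from typing import List, Tuple

DirObject = Tuple[float, float, float]  # (x, y, direction); assumed layout


def circular_diff(a: float, b: float, period: float = 2 * math.pi) -> float:
    """Smallest absolute difference between two values on a circle."""
    d = abs(a - b) % period
    return min(d, period - d)


def naive_dcknn(
    objects: List[DirObject],
    q_loc: Tuple[float, float],
    q_dir: float,
    k: int,
    max_diff: float,
    period: float = 2 * math.pi,
) -> List[DirObject]:
    # Keep only objects whose direction is close enough to the query's.
    candidates = [o for o in objects if circular_diff(o[2], q_dir, period) <= max_diff]
    # Among those, return the k closest to the query location.
    return heapq.nsmallest(
        k,
        candidates,
        key=lambda o: math.hypot(o[0] - q_loc[0], o[1] - q_loc[1]),
    )


demo_objects = [(1.0, 0.0, 0.1), (0.5, 0.0, 3.0), (2.0, 0.0, 6.2), (3.0, 0.0, 0.0)]
demo_result = naive_dcknn(demo_objects, q_loc=(0.0, 0.0), q_dir=0.0, k=2, max_diff=0.2)
```

The closest object overall, at distance 0.5, is excluded because it faces the opposite way, while the object with direction 6.2 radians qualifies because directions wrap around at 2π.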

international conference on data engineering | 2015

Nearest neighborhood search in spatial databases

Dong-Wan Choi; Chin-Wan Chung

This paper proposes a group version of the nearest neighbor (NN) query, called the nearest neighborhood (NNH) query, which aims to find the nearest group of points instead of one nearest point. Given a set O of points, a query point q, and a ρ-radius circle C, the NNH query returns the nearest placement of C to q such that at least k points are enclosed by C. We present a fast algorithm for processing the NNH query based on the incremental retrieval of nearest neighbors using the R-tree structure on O. Our solution includes several techniques to efficiently maintain sets of retrieved nearest points and check their validity in terms of the closeness constraint on their points. These techniques are derived from the unique characteristics of the NNH search problem. As a side product, we solve a new geometric problem, called the nearest enclosing circle (NEC) problem, which is of independent interest. We present a linear expected-time algorithm for the NEC problem that uses properties of the NEC similar to those of the smallest enclosing circle. We provide extensive experimental results, which show that our techniques can significantly improve query performance.
