Jin Soung Yoo
Indiana University – Purdue University Fort Wayne
Publication
Featured research published by Jin Soung Yoo.
IEEE Transactions on Knowledge and Data Engineering | 2006
Jin Soung Yoo; Shashi Shekhar
Spatial colocations represent the subsets of features which are frequently located together in geographic space. Colocation pattern discovery presents challenges since spatial objects are embedded in a continuous space, whereas classical data is often discrete. A large fraction of the computation time is devoted to identifying the instances of colocation patterns. We propose a novel joinless approach for efficient colocation pattern mining. The joinless colocation mining algorithm uses an instance-lookup scheme instead of an expensive spatial or instance join operation for identifying colocation instances. We prove the joinless algorithm is correct and complete in finding colocation rules. We also describe a partial join approach for spatial data which are clustered in neighborhood areas. We provide the algebraic cost models to characterize the performance dominance zones of the joinless method and the partial join method with a current join-based colocation mining method, and compare their computational complexities. In the experimental evaluation, using synthetic and real-world data sets, our methods performed more efficiently than the join-based method and showed greater scalability in dense data.
advances in geographic information systems | 2004
Jin Soung Yoo; Shashi Shekhar
Spatial co-location patterns represent the subsets of events whose instances are frequently located together in geographic space. We identified the computational bottleneck in the execution time of a current co-location mining algorithm. A large fraction of the execution time of the join-based co-location miner algorithm is devoted to computing joins to identify instances of candidate co-location patterns. We propose a novel partial-join approach for mining co-location patterns efficiently. It transactionizes continuous spatial data while keeping track of the spatial information not modeled by transactions. It uses a transaction-based Apriori algorithm as a building block and adopts the instance join method for residual instances not identified in transactions. We show that the algorithm is correct and complete in finding all co-location rules which have prevalence and conditional probability above the given thresholds. An experimental evaluation using synthetic datasets and a real dataset shows that our algorithm is computationally more efficient than the join-based algorithm.
advances in geographic information systems | 2003
Shashi Shekhar; Jin Soung Yoo
Nearest neighbor query is one of the most important operations in spatial databases and their application domains, e.g., location-based services and advanced traveler information systems. This paper addresses the problem of finding the in-route nearest neighbor (IRNN) for a query object tuple which consists of a given route with a destination and a current location on it. The IRNN is a facility instance via which the detour from the original route on the way to the destination is smallest. This paper addresses four alternative solution methods. Comparisons among them are presented using an experimental framework. Several experiments using real road map datasets are conducted to examine the behavior of the solutions in terms of three parameters affecting the performance. Our experiments show that the computation costs for all methods except the precomputed zone-based method increase with increases in the road map size and the query route length but decrease with increases in the facility density. The precomputed zone-based method shows the most efficiency when there are no updates on the road map.
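As a rough illustration of the IRNN definition in the abstract above, the following Python sketch finds the facility with the smallest detour by brute force. It is not one of the four methods the paper compares. The graph layout (an undirected adjacency dictionary), the function names, and the assumption that the traveler may leave the route and rejoin it at any route node further ahead are all hypothetical choices made for this example.

```python
import heapq


def dijkstra(graph, source):
    """Shortest-path distances from source in a weighted adjacency dict."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def in_route_nearest_neighbor(graph, route, facilities):
    """Return (detour, facility) with the smallest detour, or None.

    Assumes an undirected graph, so d(facility, node) == d(node, facility),
    and that consecutive route nodes are directly connected.
    """
    # Cumulative distance travelled along the route up to each route node.
    along = [0.0]
    for a, b in zip(route, route[1:]):
        along.append(along[-1] + graph[a][b])
    from_node = {n: dijkstra(graph, n) for n in route}

    best = None
    for f in facilities:
        for i, leave in enumerate(route):
            for j in range(i, len(route)):
                to_f = from_node[leave].get(f)
                back = from_node[route[j]].get(f)
                if to_f is None or back is None:
                    continue  # facility unreachable from this route node
                # Extra distance compared with staying on the route.
                detour = to_f + back - (along[j] - along[i])
                if best is None or detour < best[0]:
                    best = (detour, f)
    return best
```

This sketch runs one shortest-path search per route node and is meant only to make the detour definition concrete; the abstract's point is that the compared methods differ in how they avoid such expensive path computations.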
international conference on data mining | 2005
Jin Soung Yoo; Shashi Shekhar; Mete Celik
Spatial co-location patterns represent the subsets of features whose instances are frequently located together in geographic space. Co-location pattern discovery presents challenges since the instances of spatial features are embedded in a continuous space and share a variety of spatial relationships. A large fraction of the computation time is devoted to identifying the instances of co-location patterns. We propose a novel join-less approach for co-location pattern mining, which materializes spatial neighbor relationships with no loss of co-location instances and reduces the computational cost of identifying the instances. The join-less co-location mining algorithm is efficient since it uses an instance-lookup scheme instead of an expensive spatial or instance join operation for identifying co-location instances. The experimental evaluations show the join-less algorithm performs more efficiently than a current join-based algorithm and is scalable in dense spatial datasets.
international conference on data mining | 2006
Mete Celik; Shashi Shekhar; James P. Rogers; James A. Shine; Jin Soung Yoo
Mixed-drove spatio-temporal co-occurrence patterns (MDCOPs) represent subsets of object-types that are located together in space and time. Discovering MDCOPs is an important problem with many applications such as identifying tactics in battlefields, games, and predator-prey interactions. However, mining MDCOPs is computationally very expensive because the interest measures are computationally complex, datasets are larger due to the archival history, and the set of candidate patterns is exponential in the number of object-types. We propose a monotonic composite interest measure for discovering MDCOPs and a novel MDCOP mining algorithm. Analytical and experimental results show that the proposed algorithm is correct and complete. Results also show the proposed method is computationally more efficient than naive alternatives.
IEEE Transactions on Knowledge and Data Engineering | 2009
Jin Soung Yoo; Shashi Shekhar
Given a time-stamped transaction database and a user-defined reference sequence of interest over time, similarity-profiled temporal association mining discovers all associated item sets whose prevalence variations over time are similar to the reference sequence. The similar temporal association patterns can reveal interesting relationships of data items which co-occur with a particular event over time. Most works in temporal association mining have focused on capturing special temporal regulation patterns such as cyclic patterns and calendar scheme-based patterns. However, our model is flexible in representing interesting temporal patterns using a user-defined reference sequence. The dissimilarity degree of the sequence of support values of an item set to the reference sequence is used to capture how well its temporal prevalence variation matches the reference pattern. By exploiting interesting properties such as an envelope of support time sequence and a lower bounding distance for early pruning of candidate item sets, we develop an algorithm for effectively mining similarity-profiled temporal association patterns. We prove the algorithm is correct and complete in the mining results and provide the computational analysis. Experimental results on real data as well as synthetic data show that the proposed algorithm is more efficient than a sequential method using a traditional support-pruning scheme.
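To make the notion of a support sequence and its dissimilarity to a reference sequence concrete, here is a minimal Python sketch. It assumes Euclidean distance as the dissimilarity function, which the abstract does not specify, and it checks every candidate naively rather than using the envelope and lower-bounding pruning the abstract mentions. All function names are hypothetical.

```python
from math import sqrt


def support_sequence(itemset, slots):
    """Support of itemset in each time slot.

    slots is a list of time slots; each slot is a list of transactions
    (sets of items). An empty slot contributes a support of 0.0.
    """
    items = set(itemset)
    return [
        sum(1 for t in slot if items <= set(t)) / len(slot) if slot else 0.0
        for slot in slots
    ]


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences (assumed measure)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def similar_itemsets(candidates, slots, reference, threshold):
    """Keep candidates whose support sequence is within threshold of reference."""
    result = []
    for candidate in candidates:
        seq = support_sequence(candidate, slots)
        if euclidean(seq, reference) <= threshold:
            result.append((candidate, seq))
    return result
```

With two time slots and a reference sequence of `[0.5, 1.0]`, an item set that appears in half of the first slot's transactions and all of the second slot's transactions has distance 0 and is reported for any non-negative threshold.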
international conference on spatial data mining and geographical knowledge services | 2011
Jin Soung Yoo; Mark Bow
In this paper, we address the problem of discovering compact co-location patterns without a minimum prevalence threshold. A spatial co-location is a set of spatial events being frequently observed together in nearby geographic space. A common framework for mining spatial co-location patterns employs a level-wise search method (like Apriori) to discover co-location patterns, and generates numerous redundant patterns since all of the 2^l subsets of each length-l event set the algorithms discover are included in the result set. In addition, most works of spatial co-location mining require the specification of a minimum prevalence threshold to find interesting co-location patterns. However, it is difficult for users to decide an appropriate threshold value without prior knowledge of their task-specific spatial data. To solve these problems, we propose the problem of mining top-k closed co-location patterns, where k is the desired number of patterns, and develop an algorithm to efficiently find the interesting patterns. The experimental results show that the proposed algorithm is computationally effective.
Geoinformatica | 2005
Jin Soung Yoo; Shashi Shekhar
Nearest neighbor query is one of the most important operations in spatial databases and their application domains, such as location-based services and advanced traveler information systems. This paper addresses the problem of finding the in-route nearest neighbor (IRNN) for a query object tuple which consists of a given route with a destination and a current location on it. The IRNN is a facility instance via which the detour from the original route on the way to the destination is smallest. This paper addresses four alternative solution methods. Comparisons among them are presented using an experimental framework. Extensive experiments using real road map datasets are conducted to examine the behaviors of the solutions in terms of five parameters affecting the performance. The overall experiments show that our strategy to reduce the expensive path computations to minimize the response time is reasonable. The spatial distance join-based method always shows better performance with fewer path computations compared to the recursive methods. The computation costs for all methods except the precomputed zone-based method increase with increases in the road map size and the query route length but decrease with increases in the facility density. The precomputed zone-based method shows the most efficiency when there are no updates on the road map.
Data Mining and Knowledge Discovery | 2012
Jin Soung Yoo; Mark Bow
Recently, there has been considerable interest in mining spatial colocation patterns from large spatial datasets. Spatial colocation patterns represent the subsets of spatial events whose instances are often located in close geographic proximity. Most studies of spatial colocation mining require the specification of two parameter constraints to find interesting colocation patterns. One is a minimum prevalence threshold of colocations, and the other is a distance threshold to define spatial neighborhood. However, it is difficult for users to decide appropriate threshold values without prior knowledge of their task-specific spatial data. In this paper, we propose a different framework for spatial colocation pattern mining. To remove the first constraint, we propose the problem of finding N-most prevalent colocated event sets, where N is the desired number of colocated event sets with the highest interest measure values for each pattern size. We developed two alternative algorithms for mining the N-most patterns. They reduce candidate events effectively and use a filter-and-refine strategy for efficiently finding colocation instances from a spatial dataset. We prove the algorithms are correct and complete in finding the N-most prevalent colocation patterns. For the second constraint, a distance threshold for spatial neighborhood determination, we present various methods to estimate appropriate distance bounds from user input data. The result can help a user to set a distance for a conceptualization of spatial neighborhood. Our experimental results with real and synthetic datasets show that our algorithmic design is computationally effective in finding the N-most prevalent colocation patterns. The discovered patterns were different depending on the distance threshold, which shows that it is important to select appropriate neighbor distances.
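The output of the N-most prevalent problem stated above can be sketched in a few lines of Python. This example assumes the interest measure value of every candidate event set has already been computed and is passed in as a dictionary; the paper's algorithms are described as avoiding exactly that exhaustive computation, so this shows only the shape of the result, not the mining method. The function name and input layout are hypothetical.

```python
import heapq
from collections import defaultdict


def n_most_prevalent(prevalence, n):
    """Return the n highest-scoring event sets for each pattern size.

    prevalence maps an event set (any iterable of event names) to its
    precomputed interest measure value. Ties are broken by the sorted
    event names, larger first.
    """
    by_size = defaultdict(list)
    for pattern, value in prevalence.items():
        by_size[len(pattern)].append((value, tuple(sorted(pattern))))
    return {
        size: [(p, v) for v, p in heapq.nlargest(n, entries)]
        for size, entries in sorted(by_size.items())
    }
```

Because N is applied per pattern size, a size-3 event set with a low score can still be reported even when many size-2 sets score higher, which is what removes the need for a single global prevalence threshold.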
statistical and scientific database management | 2008
Jin Soung Yoo; Shashi Shekhar
We study the problem of mining all associated itemsets whose prevalence variations are similar to a given reference sequence from temporal databases. The discovered temporal association patterns can reveal interesting relationships of itemsets which co-occur with a particular event over time. A user-defined subset specification which consists of a reference sequence, a similarity function, and a dissimilarity threshold is used for defining interesting temporal patterns and guiding the similarity search. We develop algorithms that exploit interesting properties for efficiently finding the similar temporal association patterns. Experimental results show that the proposed algorithms are more efficient than a naive approach.