Chin-Wan Chung | Researchain

Publication

Featured research published by Chin-Wan Chung.

international conference on management of data | 1999

Multi-dimensional selectivity estimation using compressed histogram information

Ju-Hong Lee; Deok-Hwan Kim; Chin-Wan Chung

The database query optimizer requires an estimate of query selectivity to find the most efficient access plan. For queries referencing multiple attributes from the same relation, we need a multi-dimensional selectivity estimation technique when the attributes are dependent on each other, because the selectivity is determined by the joint data distribution of the attributes. Additionally, for multimedia databases, there are intrinsic requirements for multi-dimensional selectivity estimation because feature vectors are stored in multi-dimensional indexing trees. In the 1-dimensional case, a histogram is the most practical choice. In the multi-dimensional case, however, a histogram is not adequate because of high storage overhead and high error rates. In this paper, we propose a novel approach for multi-dimensional selectivity estimation. Compressed information from a large number of small-sized histogram buckets is maintained using the discrete cosine transform. This enables low error rates and low storage overheads even in high dimensions. In addition, this approach has the advantage of supporting dynamic data updates by eliminating the overhead of periodic reconstructions of the compressed information. Extensive experimental results show the advantages of the proposed approach.
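
The idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the 8x8 bucket grid, the zonal rule for choosing which coefficients to keep, and all function names are assumptions made for illustration. It shows a 2-dimensional bucket histogram compressed to a few low-frequency DCT coefficients, a range-count estimate computed directly from those coefficients, and an in-place update that relies only on the linearity of the transform.

```python
import math

N = 8  # buckets per dimension (hypothetical grid size)

def basis(u, x, n=N):
    """Value of the orthonormal DCT-II basis function of frequency u at bucket x."""
    scale = math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)
    return scale * math.cos((2 * x + 1) * u * math.pi / (2 * n))

def kept_frequencies(budget):
    """Assumed zonal rule: keep coefficients whose frequency indices sum to <= budget."""
    return [(u, v) for u in range(N) for v in range(N) if u + v <= budget]

def compress(hist, budget):
    """Compute only the retained DCT coefficients of an N x N bucket histogram."""
    return {
        (u, v): sum(hist[x][y] * basis(u, x) * basis(v, y)
                    for x in range(N) for y in range(N))
        for (u, v) in kept_frequencies(budget)
    }

def insert_point(coeffs, x, y, weight=1.0):
    """The DCT is linear, so a new tuple in bucket (x, y) adjusts each coefficient in place."""
    for (u, v) in coeffs:
        coeffs[(u, v)] += weight * basis(u, x) * basis(v, y)

def estimate_count(coeffs, x_lo, x_hi, y_lo, y_hi):
    """Approximate tuple count in buckets [x_lo, x_hi] x [y_lo, y_hi] (inclusive)."""
    total = 0.0
    for (u, v), c in coeffs.items():
        sx = sum(basis(u, x) for x in range(x_lo, x_hi + 1))
        sy = sum(basis(v, y) for y in range(y_lo, y_hi + 1))
        total += c * sx * sy
    return total
```

Dividing the estimated count by the total number of tuples gives a selectivity. With the largest budget (2N - 2) every coefficient is kept and the estimate is exact; smaller budgets trade accuracy for storage.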

international conference on management of data | 2003

XPRESS: a queriable compression for XML data

Jun-Ki Min; Myung-Jae Park; Chin-Wan Chung

Like HTML, many XML documents are resident on native file systems. Since XML data is irregular and verbose, disk space and network bandwidth are wasted. To overcome the verbosity problem, research on compressors for XML data has been conducted. However, some XML compressors do not support querying compressed data, while other XML compressors which support querying compressed data blindly encode tags and data values using predefined encoding methods. Thus, the query performance on compressed XML data is degraded. In this paper, we propose XPRESS, an XML compressor which supports direct and efficient evaluations of queries on compressed XML data. XPRESS adopts a novel encoding method, called reverse arithmetic encoding, which is intended for encoding label paths of XML data, and applies diverse encoding methods depending on the types of data values. Experimental results with real life data sets show that XPRESS achieves significant improvements on query performance for compressed XML data and reasonable compression ratios. On average, the query performance of XPRESS is 2.83 times better than that of an existing XML compressor and the compression ratio of XPRESS is 73%.

international conference on data engineering | 2000

Similarity search for multidimensional data sequences

Seok-Lyong Lee; Seok-Ju Chun; Deok-Hwan Kim; Ju-Hong Lee; Chin-Wan Chung

Time series data, which are a series of one dimensional real numbers, have been studied in various database applications. We extend the traditional similarity search methods on time series data to support a multidimensional data sequence, such as a video stream. We investigate the problem of retrieving similar multidimensional data sequences from a large database. To prune irrelevant sequences in a database, we introduce correct and efficient similarity functions. Both data sequences and query sequences are partitioned into subsequences, and each of them is represented by a Minimum Bounding Rectangle (MBR). The query processing is based upon these MBRs, instead of scanning data elements of entire sequences. Our method is designed: (1) to select candidate sequences in a database, and (2) to find the subsequences of a selected sequence, each of which falls under the given threshold. The latter is of special importance in the case of retrieving subsequences from large and complex sequences such as video. By using it, we do not need to browse the whole of the selected video stream, but just browse the sub-streams to find a scene we want. We have performed an extensive experiment on synthetic, as well as real data sequences (a collection of TV news, dramas, and documentary videos) to evaluate our proposed method. The experiment demonstrates that 73-94 percent of irrelevant sequences are pruned using the proposed method, resulting in 16-28 times faster response time compared with that of the sequential search.
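
The MBR-based filtering described above can be sketched as follows. This is a simplified illustration rather than the paper's method: it assumes fixed-length segments and uses the plain minimum Euclidean distance between two boxes, whereas the abstract does not specify the partitioning rule or the exact similarity functions. All names are hypothetical.

```python
import math

def mbrs(seq, seg_len):
    """Split a sequence of d-dimensional points into fixed-length segments and bound each."""
    boxes = []
    for i in range(0, len(seq), seg_len):
        seg = seq[i:i + seg_len]
        dims = range(len(seg[0]))
        lo = tuple(min(p[k] for p in seg) for k in dims)
        hi = tuple(max(p[k] for p in seg) for k in dims)
        boxes.append((lo, hi))
    return boxes

def mbr_min_dist(a, b):
    """Smallest Euclidean distance between any point of box a and any point of box b."""
    (alo, ahi), (blo, bhi) = a, b
    gaps = [max(bl - ah, al - bh, 0.0)
            for al, ah, bl, bh in zip(alo, ahi, blo, bhi)]
    return math.sqrt(sum(g * g for g in gaps))

def candidate_segments(data_seq, query_seq, seg_len, threshold):
    """Indices of data segments whose MBR lies within threshold of some query MBR."""
    q_boxes = mbrs(query_seq, seg_len)
    return [i for i, box in enumerate(mbrs(data_seq, seg_len))
            if any(mbr_min_dist(box, q) <= threshold for q in q_boxes)]
```

Because the box-to-box distance never exceeds the distance between any two points inside the boxes, a segment rejected by this filter cannot contain a point within the threshold of the query, so only the surviving segments need to be inspected element by element.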

international conference on management of data | 2008

Efficient storage scheme and query processing for supply chain management using RFID

Chun-Hee Lee; Chin-Wan Chung

As the size of an RFID tag becomes smaller and the price of the tag gets lower, RFID technology has been applied to a wide range of areas. Recently, RFID has been adopted in business areas such as supply chain management. Since companies can easily get movement information for products using RFID technology, it is expected to revolutionize supply chain management. However, the amount of RFID data in supply chain management is huge. Therefore, it requires much time to extract valuable information from RFID data for supply chain management. In this paper, we define query templates for tracking queries and path oriented queries to analyze the supply chain. We then propose an effective path encoding scheme to encode the flow information for products. To retrieve the time information for products efficiently, we utilize a numbering scheme used in the XML area. Based on the path encoding scheme and the numbering scheme, we devise a storage scheme to process tracking queries and path oriented queries efficiently. Finally, we propose a method which translates the queries to SQL queries. Experimental results show that our approach can process the queries efficiently. On average, our approach is about 680 times better than a recent technique in terms of query performance.

international conference on management of data | 2003

QCluster: relevance feedback using adaptive clustering for content-based image retrieval

Deok-Hwan Kim; Chin-Wan Chung

Learning-enhanced relevance feedback has been one of the most active research areas in content-based image retrieval in recent years. However, few methods using relevance feedback are currently available to process relatively complex queries on large image databases. In the case of complex image queries, the feature space and the distance function of the user's perception are usually different from those of the system. This difference leads to the representation of a query with multiple clusters (i.e., regions) in the feature space. Therefore, it is necessary to handle disjunctive queries in the feature space. In this paper, we propose a new content-based image retrieval method using adaptive classification and cluster-merging to find multiple clusters of a complex image query. When the measures of a retrieval method are invariant under linear transformations, the method can achieve the same retrieval quality regardless of the shapes of clusters of a query. Our method achieves the same high retrieval quality regardless of the shapes of clusters of a query since it uses such measures. Extensive experiments show that the result of our method converges quickly to the user's true information need, and the retrieval quality of our method is about 22% in recall and 20% in precision better than that of the query expansion approach, and about 34% in recall and about 33% in precision better than that of the query point movement approach, in MARS.

international conference on management of data | 2002

Selectivity estimation for spatio-temporal queries to moving objects

Yong-Jin Choi; Chin-Wan Chung

A query optimizer requires selectivity estimation of a query to choose the most efficient access plan. An effective method of selectivity estimation for the future locations of moving objects has not yet been proposed. Existing methods for spatial selectivity estimation do not accurately estimate the selectivity of a query to moving objects, because they do not consider the future locations of moving objects, which change continuously as time passes. In this paper, we propose an effective method for spatio-temporal selectivity estimation to solve this problem. We present analytical formulas which accurately calculate the selectivity of a spatio-temporal query as a function of spatio-temporal information. Extensive experimental results show that our proposed method accurately estimates the selectivity over various queries to spatio-temporal data combining real-life spatial data and synthetic temporal data. When Tiger/lines is used as real-life spatial data, the application of an existing method for spatial selectivity estimation to the estimation of the selectivity of a query to moving objects has an average error ratio from 14% to 85%, whereas our method for spatio-temporal selectivity estimation has an average error ratio from 9% to 23%.

international world wide web conferences | 2012

QUBE: a quick algorithm for updating betweenness centrality

Min-Joong Lee; Jungmin Lee; Jaimie Yejean Park; Ryan Hyun Choi; Chin-Wan Chung

The betweenness centrality of a vertex in a graph is a measure for the participation of the vertex in the shortest paths in the graph. Betweenness centrality is widely used in network analyses. Especially in a social network, the recursive computation of the betweenness centralities of vertices is performed for community detection and for finding the influential user in the network. Since a social network graph is frequently updated, it is necessary to update the betweenness centrality efficiently. When a graph is changed, the betweenness centralities of all the vertices should be recomputed from scratch using all the vertices in the graph. To the best of our knowledge, this is the first work that proposes an efficient algorithm which handles the update of the betweenness centralities of vertices in a graph. In this paper, we propose a method that efficiently reduces the search space by finding a candidate set of vertices whose betweenness centralities can be updated and computes their betweenness centralities using candidate vertices only. As the cost of calculating the betweenness centrality mainly depends on the number of vertices to be considered, the proposed algorithm significantly reduces the cost of calculation. The proposed algorithm allows the transformation of an existing algorithm which does not consider the graph update. Experimental results on large real datasets show that the proposed algorithm speeds up the existing algorithm 2 to 2418 times depending on the dataset.
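
For reference, the from-scratch computation that the abstract contrasts with can be sketched with Brandes' algorithm for unweighted, undirected graphs. This is the recomputation baseline, not QUBE itself, and the abstract does not say which existing algorithm the paper uses for comparison; the adjacency-dictionary input format is an assumption made for illustration.

```python
from collections import deque

def betweenness(graph):
    """graph: dict mapping each vertex to an iterable of its neighbours."""
    bc = {v: 0.0 for v in graph}
    for s in graph:
        # Breadth-first search from s, counting shortest paths (sigma).
        stack = []
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}
        dist = {v: -1 for v in graph}
        sigma[s] = 1
        dist[s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies in order of decreasing distance from s.
        delta = {v: 0.0 for v in graph}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each undirected shortest path was counted once from each endpoint.
    return {v: c / 2.0 for v, c in bc.items()}
```

Every edge insertion or deletion would require calling `betweenness` again over the whole graph, which is the cost an update algorithm restricted to a candidate set of vertices aims to avoid.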

international conference on management of data | 2013

Massive graph triangulation

Xiaocheng Hu; Yufei Tao; Chin-Wan Chung

This paper studies I/O-efficient algorithms for settling the classic triangle listing problem, whose solution is a basic operator in dealing with many other graph problems. Specifically, given an undirected graph G, the objective of triangle listing is to find all the cliques involving 3 vertices in G. The problem has been well studied in internal memory, but remains an urgent and difficult challenge when G does not fit in memory, forcing any algorithm to incur frequent I/O accesses. Although previous research has attempted to tackle the challenge, the state-of-the-art solutions rely on a set of crippling assumptions to guarantee good performance. Motivated by this, we develop a new algorithm that is provably I/O and CPU efficient at the same time, without making any assumption on the input G at all. The algorithm uses ideas drastically different from all the previous approaches, and outperformed the existing competitors by a factor over an order of magnitude in our extensive experimentation.
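
The triangle listing problem defined above can be made concrete with a small in-memory sketch. It is a standard ordered-adjacency approach for graphs that fit in memory, not the I/O-efficient algorithm proposed in the paper; the edge-list input and the function name are assumptions made for illustration.

```python
def list_triangles(edges):
    """Return every triangle (u, v, w) with u < v < w exactly once."""
    higher = {}  # vertex -> set of neighbours with a larger label
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        lo, hi = (u, v) if u < v else (v, u)
        higher.setdefault(lo, set()).add(hi)
        higher.setdefault(hi, set())
    triangles = []
    for u in sorted(higher):
        for v in sorted(higher[u]):
            # w closes a triangle if it is a larger neighbour of both u and v.
            for w in sorted(higher[u] & higher[v]):
                triangles.append((u, v, w))
    return triangles
```

Orienting each edge from the smaller to the larger label guarantees that a triangle is reported only from its smallest vertex, so no deduplication step is needed. The set intersections assume random access to adjacency lists, which is the assumption that fails when G does not fit in memory.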

Information Processing Letters | 2003

Efficient extraction of schemas for XML documents

Jun-Ki Min; Jae-Yong Ahn; Chin-Wan Chung

In this paper, we present a technique for efficient extraction of concise and accurate schemas for XML documents. By restricting the schema form and applying some heuristic rules, we achieve the efficiency and conciseness. The result of an experiment with real-life DTDs shows that our approach attains high accuracy and is 20 to 200 times faster than existing approaches.

database systems for advanced applications | 2011

A user similarity calculation based on the location for social network services

Min-Joong Lee; Chin-Wan Chung

Online social network services have been growing rapidly over the past few years, and social network services can easily obtain the locations of users with the recent increasing popularity of GPS-enabled mobile devices. In a social network, calculating the similarity between users is an important issue. User similarity has a significant impact on users, communities and service providers by helping them acquire suitable information effectively. There are numerous factors, such as location, interest and gender, for calculating user similarity. Location has become a very important factor among them, since social network services are now highly coupled with the mobile device which the user holds all the time. There have been several studies on calculating user similarity. However, most of them did not consider location. Even if some methods consider location, they only consider the physical location of the user, which cannot be used for capturing the user's intention. We propose an effective method to calculate user similarity using the semantics of the location. By using the semantics of the location, we can capture the user's intention and interest. Moreover, we can calculate the similarity between different locations using the hierarchical location category. To the best of our knowledge, this is the first research that uses the semantics of the location in order to calculate user similarity. We evaluate the proposed method with a real-world use case: finding the most similar user of a user. We collected more than 251,000 visited locations over 591 users from Foursquare. The experimental results show that the proposed method outperforms a popular existing method for calculating user similarity.