Publication


Featured research published by Young-Koo Lee.


Information & Software Technology | 2002

The clustering property of corner transformation for spatial database applications

Ju-Won Song; Kyu-Young Whang; Young-Koo Lee; Min-Jae Lee; Wook-Shin Han; Byung-Kwon Park

Spatial access methods (SAMs) are often used as clustering indexes in spatial database systems; a SAM should therefore have the clustering property both in the index and in the data file. In this paper, we argue that corner transformation preserves the clustering property: objects having similar sizes and positions in the original space tend to be placed in the same region in the transform space. We then show that SAMs based on corner transformation can maintain clustering both in the index and in the data file for storage systems with fixed object positions, and we propose the MBR-MLGF as an example of such an index. In storage systems with fixed object positions, inserted objects never move during the operation of the system; most storage systems currently available adopt this architecture. Extensive experiments comparing against the R∗-tree show that corner transformation indeed preserves the clustering property and can therefore serve as a useful method for spatial query processing. This result reverses the common belief that transformation adversely affects clustering and shows that the transformation maintains clustering in the transform space as well as conventional techniques, such as the R∗-tree, do in the original space.
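
The corner transformation used here maps each coordinate interval to a point whose components are the interval's endpoints, so a two-dimensional MBR becomes a four-dimensional point. The following sketch is a Python illustration of that mapping, not the paper's MBR-MLGF implementation; it shows why objects with similar positions and sizes end up close together in the transform space.

```python
# Illustrative sketch of corner transformation (not the paper's MBR-MLGF code).
# A 1-D interval [lo, hi] maps to the 2-D point (lo, hi); a 2-D rectangle
# (an MBR) therefore maps to a 4-D point (xlo, xhi, ylo, yhi).

def corner_transform(mbr):
    """Map an MBR ((xlo, ylo), (xhi, yhi)) to a 4-D transform-space point."""
    (xlo, ylo), (xhi, yhi) = mbr
    return (xlo, xhi, ylo, yhi)

# Two rectangles with similar positions and sizes map to nearby 4-D points,
# which is the clustering property the paper argues is preserved.
a = ((10.0, 20.0), (14.0, 26.0))
b = ((11.0, 21.0), (15.0, 27.0))
pa, pb = corner_transform(a), corner_transform(b)
distance = sum((x - y) ** 2 for x, y in zip(pa, pb)) ** 0.5
print(pa, pb, distance)   # the points differ by 1 on every axis -> distance 2.0
```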


Information Sciences | 1997

A physical database design method for multidimensional file organizations

Jong-Hak Lee; Young-Koo Lee; Kyu-Young Whang; Il-Yeol Song

This paper presents a physical database design methodology for multidimensional file organizations. Physical database design is the process of determining the optimal configuration of physical files and access structures for a given set of queries. Recently, many multidimensional file organizations have been proposed in the literature; however, there has been no effort toward their physical database design. We first show that the performance of query processing is highly affected by the similarity between the shapes of query regions and page regions in the domain space, and then propose a method for finding the optimal configuration of the multidimensional file by controlling the interval ratio of the axes to achieve this similarity. For performance evaluation, we perform extensive experiments with the multilevel grid file, a multidimensional file organization, using various types of queries and record distributions. The results indicate that our proposed method builds optimal multilevel grid files regardless of the query types and record distributions. When the interval ratio of a two-dimensional query region is 1:1024, the performance of the proposed method is as much as 7.5 times better than that of the conventional method, which uses an interval ratio of 1:1 with the cyclic splitting strategy. The improvement is even larger for query types with higher interval ratios. The result is significant since interval ratios can be far from 1:1 in many practical applications, especially when different axes have different domains.
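
As a rough illustration of why page-region shape should match query-region shape, the sketch below estimates the number of page regions a range query touches; the q/p + 1 estimate and all numbers are my own simplification, not the paper's cost model.

```python
# Rough illustration (my own, not the paper's design algorithm) of why the
# shape of page regions should resemble the shape of typical query regions.
# A query of extent q_i on axis i overlaps roughly q_i / p_i + 1 page regions
# of extent p_i on that axis, so the expected page accesses are the product.

import math

def expected_page_accesses(query_extents, page_extents):
    return math.prod(q / p + 1 for q, p in zip(query_extents, page_extents))

query = (1.0, 1024.0)          # a 1:1024 query region, as in the experiment
area  = 64.0 * 64.0            # keep the page-region area fixed

square_pages  = (64.0, 64.0)                                       # 1:1 regions
matched_pages = (math.sqrt(area / 1024), math.sqrt(area * 1024))   # 1:1024 regions

print(expected_page_accesses(query, square_pages))    # ~17.3 page regions
print(expected_page_accesses(query, matched_pages))   # 2.25 page regions
```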


Information Sciences | 2003

An aggregation algorithm using a multidimensional file in multidimensional OLAP

Young-Koo Lee; Kyu-Young Whang; Yang-Sae Moon; Il-Yeol Song

Aggregation is an operation that plays a key role in multidimensional OLAP (MOLAP). Existing aggregation methods in MOLAP have been proposed for file structures such as multidimensional arrays. These file structures are suitable for data with uniform distributions, but do not work well with skewed distributions. In this paper, we consider an aggregation method that uses dynamic multidimensional files adapting to skewed distributions. In these multidimensional files, the sizes of page regions vary according to the data density in those regions, and the pages that belong to a larger region are accessed multiple times while computing aggregations. To solve this problem, we first present an aggregation computation model that uses the new notions of disjoint-inclusive partition and induced space-filling curves. Based on this model, we then present a dynamic aggregation algorithm. Using these notions, the algorithm allows us to maximize the effectiveness of the buffer: we control the page access order in such a way that a page being accessed can reside in the buffer until its next access. We have conducted experiments to show the effectiveness of our approach. Experimental results for a real data set show that the algorithm reduces the number of disk accesses by up to 5.09 times compared with a naive algorithm. The results further show that the algorithm achieves near-optimal performance (i.e., normalized I/O = 1.01) with total main memory (needed for the buffer and the result table) of less than 1.0% of the database size. We believe our work also provides an excellent formal basis for investigating further issues in computing aggregations in MOLAP.
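
The sketch below illustrates the page-ordering idea with a plain Z-order (Morton) curve as a stand-in for the paper's induced space-filling curves: when cell accesses are sorted along such a curve, accesses that fall in the same page region arrive close together, so the page can remain in the buffer between them.

```python
# Illustrative sketch (not the paper's algorithm): ordering cell accesses
# along a Z-order (Morton) curve so that accesses to the same page region
# arrive close together and the page can stay in the buffer in between.

def morton2(x, y, bits=16):
    """Interleave the bits of x and y to get a Z-order key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)
        key |= ((y >> i) & 1) << (2 * i + 1)
    return key

# Cells of a 4x4 grid visited in Z-order; cells that share a page region
# (here, each 2x2 quadrant) end up adjacent in the access sequence.
cells = [(x, y) for x in range(4) for y in range(4)]
order = sorted(cells, key=lambda c: morton2(*c))
print(order)   # each 2x2 quadrant appears as a contiguous run
```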


Very Large Data Bases | 2002

A one-pass aggregation algorithm with the optimal buffer size in multidimensional OLAP

Young-Koo Lee; Kyu-Young Whang; Yang-Sae Moon; Il-Yeol Song

Aggregation is an operation that plays a key role in multidimensional OLAP (MOLAP). Existing aggregation methods in MOLAP have been proposed for file structures such as multidimensional arrays. These file structures are suitable for data with uniform distributions, but do not work well with skewed distributions. In this paper, we consider an aggregation method that uses dynamic multidimensional files adapting to skewed distributions. In these multidimensional files, the sizes of page regions vary according to the data density in those regions, and the pages that belong to a larger region are accessed multiple times while computing aggregations. To solve this problem, we first present an aggregation computation model, called the Disjoint-Inclusive Partition (DIP) computation model, which is the formal basis of our approach. Based on this model, we then present the one-pass aggregation algorithm. This algorithm computes aggregations using the one-pass buffer size, which is the minimum buffer size required to guarantee one disk access per page. We prove that our aggregation algorithm is optimal with respect to the one-pass buffer size under our aggregation computation model. Using the DIP computation model allows us to correctly predict the order of accessing data pages in advance. Thus, our algorithm achieves the optimal one-pass buffer size by using a buffer replacement policy, such as Belady's B0 or the Toss-Immediate policy, that exploits the page access order computed in advance. Since the page access order is not known a priori in general, these policies have been considered impractical despite their theoretical significance. Nevertheless, in this paper, we show that they can be used effectively for aggregation computation.

We have conducted extensive experiments. We first demonstrate that the theoretically derived one-pass buffer size is indeed correct in real environments. We then compare the performance of the one-pass algorithm with that of other algorithms. Experimental results for a real data set show that the one-pass algorithm reduces the number of disk accesses by up to 7.31 times compared with a naive algorithm. We also show that the memory requirement of our algorithm for processing the aggregation in one pass is very small, at 0.05%-0.6% of the database size. These results indicate that our algorithm is practical even for a fairly large database. We believe our work provides an excellent formal basis for investigating further issues in computing aggregations in MOLAP.
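
The sketch below illustrates the notion of a one-pass buffer size for a known page access sequence: if every page stays buffered from its first to its last reference, each page is read from disk exactly once, and the minimum buffer size for this is the peak number of such live pages. The access sequence is invented for illustration; this is not the paper's derivation.

```python
# Illustrative sketch (not the paper's derivation): when the page access
# order is known in advance, each page can be read from disk exactly once
# if it stays in the buffer from its first to its last reference.  The
# minimum buffer size for that is the peak number of such "live" pages.

def one_pass_buffer_size(access_sequence):
    first, last = {}, {}
    for i, page in enumerate(access_sequence):
        first.setdefault(page, i)
        last[page] = i
    live, peak = 0, 0
    for i, page in enumerate(access_sequence):
        if first[page] == i:
            live += 1          # page enters the buffer at its first reference
            peak = max(peak, live)
        if last[page] == i:
            live -= 1          # page can be tossed after its last reference
    return peak

# A made-up access sequence in which page "A" (a larger region) recurs.
print(one_pass_buffer_size(["A", "B", "A", "C", "A", "D", "D", "E"]))  # -> 2
```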


Conference on Information and Knowledge Management | 2002

Partial rollback in object-oriented/object-relational database management systems

Won-Young Kim; Kyu-Young Whang; Byung Suk Lee; Young-Koo Lee; Ji-Woong Chang

In a database management system (DBMS), partial rollback is an important mechanism for canceling only part of the operations executed in a transaction, back to a savepoint. Partial rollback complicates buffer management because it must restore the state of the buffers as well as that of the database. Several relational DBMSs (RDBMSs) currently provide this mechanism using page buffers. However, object-oriented or object-relational DBMSs (OO/ORDBMSs) cannot use the partial rollback scheme of RDBMSs as is because, unlike RDBMSs, many of them use a dual buffer consisting of an object buffer and a page buffer. In this paper, we present a thorough study of partial rollback schemes for OO/ORDBMSs with a dual buffer. First, we classify the partial rollback schemes of OO/ORDBMSs into single buffer-based schemes and dual buffer-based schemes according to the number of buffers used to process the rollback. Next, we propose four alternative partial rollback schemes: a page buffer-based scheme, an object buffer-based scheme, a dual buffer-based scheme using a soft log, and a dual buffer-based scheme using shadows. We then evaluate their performance through simulations. The results show that the dual buffer-based partial rollback scheme using shadows provides the best performance. Partial rollback in OO/ORDBMSs has not been addressed in the literature; yet, it is a useful mechanism that must be implemented. The proposed schemes are practical ones that can be implemented in such DBMSs.
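
For illustration only, the sketch below shows the basic savepoint and partial-rollback mechanism with an in-memory undo log; the paper's contribution concerns how the object buffer and page buffer of a dual-buffer OO/ORDBMS are restored, which this toy example does not model.

```python
# Minimal sketch of savepoints and partial rollback using an undo log.
# This only illustrates the mechanism; the paper's schemes additionally
# restore the state of the object buffer and page buffer (dual buffer).

class MiniTransaction:
    def __init__(self):
        self.data = {}
        self.undo_log = []        # list of (key, old_value) records
        self.savepoints = {}      # savepoint name -> undo-log position

    def write(self, key, value):
        self.undo_log.append((key, self.data.get(key)))
        self.data[key] = value

    def savepoint(self, name):
        self.savepoints[name] = len(self.undo_log)

    def rollback_to(self, name):
        pos = self.savepoints[name]
        while len(self.undo_log) > pos:       # undo operations in reverse order
            key, old = self.undo_log.pop()
            if old is None:
                self.data.pop(key, None)
            else:
                self.data[key] = old

tx = MiniTransaction()
tx.write("x", 1)
tx.savepoint("sp1")
tx.write("x", 2)
tx.write("y", 3)
tx.rollback_to("sp1")
print(tx.data)   # {'x': 1}: work after sp1 is undone, earlier work survives
```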


Information Processing Letters | 2002

Global lock escalation in database management systems

Ji-Woong Chang; Young-Koo Lee; Kyu-Young Whang

Since database management systems (DBMSs) have limited lock resources, transactions requesting locks beyond the limit must be aborted, degrading performance abruptly. Lock escalation can be used effectively in such circumstances to alleviate the problem. Many lock escalation methods have been proposed and implemented in commercial DBMSs. However, they have certain problems due to the local nature of their decisions on when to execute lock escalation. In this paper, we propose a new lock escalation method, global lock escalation, that makes this decision globally based on the total number of locks. Through extensive simulation, we show that the global lock escalation method significantly outperforms the existing ones. In particular, we show that the number of allowable concurrent transactions increases by 2 to 16 times. We believe our method can be easily implemented in commercial DBMSs, enhancing performance significantly under excessive lock requests.
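
The sketch below illustrates the global criterion: row locks are escalated to a table lock once the total number of locks in the whole system, rather than a per-transaction or per-table count, exceeds a limit. The data structures are invented for illustration and are not the paper's implementation.

```python
# Minimal sketch of global lock escalation: row locks are escalated to a
# table lock when the TOTAL number of locks in the system (not just the
# locks of one transaction or on one table) exceeds a global limit.

class LockTable:
    def __init__(self, global_limit):
        self.global_limit = global_limit
        self.row_locks = {}       # (txn, table) -> set of locked row ids
        self.table_locks = set()  # set of (txn, table)

    def total_locks(self):
        return sum(len(r) for r in self.row_locks.values()) + len(self.table_locks)

    def lock_row(self, txn, table, row):
        if (txn, table) in self.table_locks:
            return                                    # already covered
        self.row_locks.setdefault((txn, table), set()).add(row)
        if self.total_locks() > self.global_limit:
            self.escalate(txn, table)

    def escalate(self, txn, table):
        # Replace the transaction's row locks on this table by one table lock.
        self.row_locks.pop((txn, table), None)
        self.table_locks.add((txn, table))

locks = LockTable(global_limit=3)
for row in range(5):
    locks.lock_row("T1", "orders", row)
print(locks.table_locks, locks.total_locks())   # {('T1', 'orders')} 1
```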


Information Sciences | 1999

A recovery method supporting user-interactive undo in database management systems

Won-Young Kim; Kyu-Young Whang; Young-Koo Lee; Sang-Wook Kim

User-interactive undo is a recovery facility that allows users to correct mistakes easily by canceling and reexecuting operations that have already been executed. Supporting user-interactive undo is essential for authoring processes in new database applications such as software engineering, hypermedia, and computer-aided design. A partial rollback using savepoints, as supported by commercial database management systems (DBMSs), allows only cancellation of executed operations and is thus a restricted form of user-interactive undo. Although many applications use DBMSs, they have to provide user-interactive undo by themselves due to the lack of support from the DBMSs. Since implementing user-interactive undo is quite complex, it poses a significant burden on application programmers. This paper proposes a new recovery method facilitating user-interactive undo in DBMSs. Such a facility relieves programmers of implementing user-interactive undo themselves when developing DBMS applications. The method guarantees fast rollback of transactions that contain user-interactive undos. It also provides users with a bulk undo operation that restores the database to a predetermined point in the past. The bulk undo operation resembles partial rollback, but differs in that it allows a redo that cancels the bulk undo. Moreover, the performance of the method is comparable to that of the traditional recovery method in spite of the added functionality.
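
The sketch below illustrates the user-visible behavior: unlike a plain partial rollback, a bulk undo to an earlier point can itself be canceled by a redo. It is a toy illustration of the interface, not the proposed recovery method.

```python
# Minimal sketch of user-interactive undo: unlike a partial rollback, a
# bulk undo to an earlier point can itself be canceled (redone).  This only
# illustrates the user-visible behavior, not the recovery method.

class UndoableStore:
    def __init__(self):
        self.data = {}
        self.log = []       # (key, old_value, new_value) records
        self.cursor = 0     # operations in log[:cursor] are currently applied

    def write(self, key, value):
        del self.log[self.cursor:]               # new work discards redo history
        self.log.append((key, self.data.get(key), value))
        self.data[key] = value
        self.cursor += 1

    def bulk_undo(self, to_position):
        while self.cursor > to_position:         # cancel executed operations
            self.cursor -= 1
            key, old, _ = self.log[self.cursor]
            if old is None:
                self.data.pop(key, None)
            else:
                self.data[key] = old

    def redo(self):
        key, _, new = self.log[self.cursor]      # cancel the undo itself
        self.data[key] = new
        self.cursor += 1

store = UndoableStore()
store.write("title", "draft")
store.write("title", "final")
store.bulk_undo(to_position=1)
print(store.data)    # {'title': 'draft'}
store.redo()
print(store.data)    # {'title': 'final'} -- the bulk undo was canceled
```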


Information Systems | 2005

A formal approach to lock escalation

Ji-Woong Chang; Kyu-Young Whang; Young-Koo Lee; Jae-Heon Yang; Yong-Chul Oh

Since database management systems (DBMSs) have limited lock resources, transactions requesting locks beyond the limit must be aborted, degrading performance abruptly. Lock escalation is considered a solution to this problem. However, existing lock escalation methods have been designed in an ad hoc manner, so they do not provide a complete solution. In this paper, we propose a formal model of lock escalation. Using the model, we analyze the roles of lock escalation formally and solve the problems of the existing methods systematically. In particular, we introduce the concept of the unescalatable lock, a lock that cannot be escalated due to conflicts. We identify unescalatable locks as the major cause of exhausting lock resources. We then analyze the reasons why unescalatable locks are generated and propose a new lock escalation method, adaptive lock escalation, which controls lock escalation based on the number of unescalatable locks. Through extensive simulation, we show that adaptive lock escalation significantly outperforms existing methods, reducing the number of aborts and the average response time and increasing the throughput to a great extent. Adaptive lock escalation drastically reduces (by more than 10-fold) the number of lock resources required to maintain the same level of throughput and average response time. At the same time, the throughput and average response time under adaptive lock escalation are rather insensitive to the number of lock resources, whereas existing methods rely on users to estimate this number accurately at system initialization time. Adaptive lock escalation greatly alleviates this burden.
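
The sketch below illustrates one plausible reading of the unescalatable-lock idea: a transaction's row locks on a table cannot be escalated when another transaction also holds locks on that table, and adaptive escalation reacts to the count of such locks. The definition used here is an assumption for illustration, not the paper's formal model.

```python
# Illustrative sketch (an assumed simplification, not the paper's formal
# model): a transaction's row locks on a table are treated as "unescalatable"
# when another transaction also holds locks on that table, because a table
# lock would then conflict.  Adaptive lock escalation reacts to their count.

def count_unescalatable_locks(row_locks):
    """row_locks: dict mapping (txn, table) -> number of row locks held."""
    holders = {}                                  # table -> set of transactions
    for (txn, table) in row_locks:
        holders.setdefault(table, set()).add(txn)
    return sum(n for (txn, table), n in row_locks.items()
               if len(holders[table]) > 1)        # shared table: cannot escalate

row_locks = {("T1", "orders"): 40, ("T2", "orders"): 5, ("T1", "items"): 30}
print(count_unescalatable_locks(row_locks))       # 45: both lock sets on "orders"
# The 30 locks T1 holds on "items" could be escalated to a single table lock.
```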


Conference on Information and Knowledge Management | 1999

Transformation-based spatial join

Ju-Won Song; Kyu-Young Whang; Young-Koo Lee; Min-Jae Lee; Sang-Wook Kim

Spatial join finds pairs of spatial objects having a specific spatial relationship in spatial database systems. A number of spatial join algorithms have recently been proposed in the literature. Most of them, however, perform the join in the original space. Joining in the original space has the drawback of having to deal with the sizes of objects, which makes it difficult to develop a formal algorithm that does not rely on heuristics. In this paper, we propose a spatial join algorithm based on the transformation technique. An object having a size in the two-dimensional original space is transformed into a point in the four-dimensional transform space, and the join is performed on these point objects. This can easily be extended to n-dimensional cases. We show the excellence of the proposed approach through analysis and extensive experiments. The results show that the proposed algorithm generally performs better than the R∗-tree-based algorithm proposed by Brinkhoff et al. This is a strong indication that corner transformation preserves clustering among objects and that spatial operations can be performed better in the transform space than in the original space. This reverses the common belief that transformation adversely affects clustering. We believe our result will provide new insight into transformation-based spatial query processing.
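
The sketch below shows, with a brute-force illustration rather than the paper's join algorithm, how an intersection join becomes a region predicate in the transform space: for a rectangle b, every intersecting rectangle a satisfies a.xlo <= b.xhi, a.xhi >= b.xlo, and likewise for y, which is an axis-aligned condition on a's four-dimensional point.

```python
# Illustrative sketch (brute force, not the paper's join algorithm): after
# corner transformation, "a intersects b" becomes an axis-aligned region
# predicate over the 4-D point of a, so an intersection join can be run as
# region queries in transform space.

def corner_transform(rect):
    (xlo, ylo), (xhi, yhi) = rect
    return (xlo, xhi, ylo, yhi)

def join_region(rect):
    """Predicate for the transform-space region of rectangles intersecting rect."""
    (xlo, ylo), (xhi, yhi) = rect
    # a intersects rect  iff  a.xlo <= xhi, a.xhi >= xlo, a.ylo <= yhi, a.yhi >= ylo
    return lambda p: p[0] <= xhi and p[1] >= xlo and p[2] <= yhi and p[3] >= ylo

R = [((0, 0), (2, 2)), ((5, 5), (6, 6))]
S = [((1, 1), (3, 3)), ((7, 7), (8, 8))]
points = [(corner_transform(a), a) for a in R]
for b in S:
    inside = join_region(b)
    print(b, "joins", [a for p, a in points if inside(p)])
```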


Database Systems for Advanced Applications | 2006

An efficient algorithm for computing range-groupby queries

Young-Koo Lee; Woong-Kee Loh; Yang-Sae Moon; Kyu-Young Whang; Il-Yeol Song

Aggregation queries for arbitrary regions in an n-dimensional space are powerful tools for data analysis in OLAP. A GROUP BY query in OLAP is very important since it allows us to summarize various trends along any combination of dimensions. In this paper, we extend previous aggregation queries by including the GROUP BY clause for arbitrary regions. We call the extension range-groupby queries and present an efficient algorithm for processing them. A typical method of achieving fast response time for aggregation queries is to use the prefix-sum array, which stores precomputed partial aggregation values. A naive method for range-groupby queries maintains a prefix-sum array for each combination of the grouping dimensions in an n-dimensional cube, which incurs enormous storage overhead. Our algorithm maintains only one prefix-sum array and still effectively processes range-groupby queries for all possible combinations of multiple grouping dimensions. Compared with the naive method, our algorithm reduces the space overhead by
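
The sketch below shows the standard prefix-sum array technique that the abstract builds on (the range-groupby extension itself is not shown): each entry P[i][j] stores the sum of all cells with indexes at most (i, j), so any range sum takes four lookups via inclusion-exclusion.

```python
# Sketch of the standard prefix-sum array the abstract builds on (the
# range-groupby extension itself is not shown): P[i][j] holds the sum of
# all cells cube[x][y] with x <= i and y <= j, so any range sum needs only
# four lookups via inclusion-exclusion.

def build_prefix_sum(cube):
    n, m = len(cube), len(cube[0])
    P = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            P[i][j] = (cube[i][j]
                       + (P[i - 1][j] if i > 0 else 0)
                       + (P[i][j - 1] if j > 0 else 0)
                       - (P[i - 1][j - 1] if i > 0 and j > 0 else 0))
    return P

def range_sum(P, lo, hi):
    (li, lj), (ui, uj) = lo, hi
    total = P[ui][uj]
    if li > 0:            total -= P[li - 1][uj]
    if lj > 0:            total -= P[ui][lj - 1]
    if li > 0 and lj > 0: total += P[li - 1][lj - 1]
    return total

cube = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
P = build_prefix_sum(cube)
print(range_sum(P, (1, 1), (2, 2)))   # 5 + 6 + 8 + 9 = 28
```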

Collaboration


Dive into Young-Koo Lee's collaborations.

Top Co-Authors

Yang-Sae Moon

Kangwon National University

Wook-Shin Han

Pohang University of Science and Technology
