Yang-Sae Moon | Researchain

Publication

Featured research published by Yang-Sae Moon.

International Conference on Management of Data | 2002

General match: a subsequence matching method in time-series databases based on generalized windows

Yang-Sae Moon; Kyu-Young Whang; Wook-Shin Han

We generalize the method of constructing windows in subsequence matching. By this generalization, we can explain earlier subsequence matching methods as special cases of a common framework. Based on the generalization, we propose a new subsequence matching method, General Match. The earlier work by Faloutsos et al. (called FRM for convenience) causes a lot of false alarms due to the lack of a point-filtering effect. Dual Match, recently proposed as a dual approach of FRM, improves performance significantly over FRM by exploiting the point-filtering effect. However, it has the problem of having a smaller allowable window size (half that of FRM) given the minimum query length. A smaller window increases false alarms due to the window size effect. General Match offers the advantages of both methods: it can reduce the window size effect by using large windows like FRM and, at the same time, can exploit the point-filtering effect like Dual Match. General Match divides data sequences into generalized sliding windows (J-sliding windows) and the query sequence into generalized disjoint windows (J-disjoint windows). We formally prove that General Match is correct, i.e., it incurs no false dismissal. We then propose a method of estimating the optimal value of the sliding factor J that minimizes the number of page accesses. Experimental results for real stock data show that, for low selectivities (10⁻⁶ to 10⁻⁴), General Match improves average performance by 117% over Dual Match and by 998% over FRM; for high selectivities (10⁻³ to 10⁻¹), by 45% over Dual Match and by 64% over FRM. The proposed generalization provides an excellent theoretical basis for understanding the underlying mechanisms of subsequence matching.

Information Sciences | 2007

Efficient moving average transform-based subsequence matching algorithms in time-series databases

Yang-Sae Moon; Jinho Kim

Moving average transform is very useful in finding the trend of time-series data by reducing the effect of noise, and has been used in many areas such as econometrics. Previous subsequence matching methods with moving average transform, however, are problematic in that, since they must build multiple indexes in supporting transform of arbitrary order, they incur index overhead both in storage space and in update maintenance. To solve this problem, we propose a single-index approach to subsequence matching that supports moving average transform of arbitrary order in time-series databases. Using the single-index approach, we can reduce both the storage space and the index maintenance overhead. In explaining the single-index approach, we first introduce the notion of poly-order moving average transform by generalizing the original definition of moving average transform. We then formally prove the correctness of poly-order transform-based subsequence matching. We also propose two subsequence matching methods based on poly-order transform that efficiently support moving average transform of arbitrary order. Experimental results for real stock data show that, compared with the sequential scan, our methods improve average performance significantly, by a factor of 22.6-33.6. Also, compared with cases in which an index is built for every moving average order, our methods reduce storage space and maintenance effort significantly while incurring only marginal performance degradation. Our approach entails the additional advantage of being generalized to support many other transforms in addition to moving average transform. Therefore, we believe that our approach will be widely used in many transform-based subsequence matching methods.
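
As a rough illustration of the transform this abstract builds on, the sketch below computes a plain k-order moving average, where each output entry is the mean of k consecutive input entries. It is not the poly-order transform or the single-index method proposed in the paper; the function name `moving_average` and its parameters are hypothetical and chosen only for this example.

```python
def moving_average(seq, k):
    """Return the k-order moving average of seq.

    Entry i of the result is the mean of seq[i], ..., seq[i + k - 1],
    so the result has len(seq) - k + 1 entries.
    """
    if k < 1 or k > len(seq):
        raise ValueError("order k must satisfy 1 <= k <= len(seq)")
    # Maintain a running sum so each step costs O(1) instead of O(k).
    window_sum = sum(seq[:k])
    result = [window_sum / k]
    for i in range(k, len(seq)):
        window_sum += seq[i] - seq[i - k]
        result.append(window_sum / k)
    return result


prices = [1, 2, 3, 4, 5]
smoothed = moving_average(prices, 2)
```

A larger order k smooths more aggressively, which is why a query may need to specify an arbitrary order and why building one index per order becomes costly.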

Data and Knowledge Engineering | 2010

Scaling-invariant boundary image matching using time-series matching techniques

Yang-Sae Moon; Bum-Soo Kim; Min-Soo Kim; Kyu-Young Whang

In this paper we address the scaling-invariant problem in boundary image matching. Supporting the invariant property against the horizontal or vertical image scaling is very important in boundary image matching to get more intuitive and more accurate matching results. We note that supporting the scaling-invariance is a challenging problem since the number of possible scaling factors is infinite. In this paper we solve this scaling-invariant problem in the time-series domain instead of the image domain. We first define the scaling distance between boundary images and present an interpolation-based method to compute that distance in the time-series domain. We then propose the notion of scaling-invariant distance between boundary images, which is the minimum distance among all possible scaling distances of two images. We use this scaling-invariant distance as the similarity measure in scaling-invariant boundary image matching. Computing the scaling-invariant distance, however, is very difficult or almost impossible, and we instead present how to compute its upper and lower bounds on the given scaling range. Using these lower and upper bounds we also propose divide-and-conquer algorithms to determine the scaling-invariant similarity between boundary images. Finally, we propose sequential and index-based matching methods, respectively, that perform the scaling-invariant boundary image matching correctly. Experimental results show that our scaling-invariant matching gets more intuitive results than the previous (scaling-variant) matching. In addition, compared with the simple sequential scan, our index-based matching method improves performance by one or two orders of magnitude.

Information Sciences | 2008

Similar sequence matching supporting variable-length and variable-tolerance continuous queries on time-series data stream

Hyo-Sang Lim; Kyu-Young Whang; Yang-Sae Moon

We propose a new similar sequence matching method that efficiently supports variable-length and variable-tolerance continuous query sequences on time-series data stream. Earlier methods do not support variable lengths or variable tolerances adequately for continuous query sequences if there are too many query sequences registered to handle in main memory. To support variable-length query sequences, we use the window construction mechanism that divides long sequences into smaller windows for indexing and searching the sequences. To support variable-tolerance query sequences, we present a new notion of intervaled sequences whose individual entries are an interval of real numbers rather than a real number itself. We also propose a new similar sequence matching method based on these notions, and then, formally prove correctness of the method. In addition, we show that our method has the prematching characteristic, which finds future candidates of similar sequences in advance. Experimental results show that our method outperforms the naive one by 2.6-102.1 times and the existing methods in the literature by 1.4-9.8 times over the entire ranges of parameters tested when the query selectivities are low (<32%), which are practically useful in large database applications.

Information Sciences | 2010

Distortion-free predictive streaming time-series matching

Woong Kee Loh; Yang-Sae Moon; Jaideep Srivastava

Efficient processing of streaming time-series generated by remote sensors and mobile devices has become an important research area. As in traditional time-series applications, similarity matching on streaming time-series is also an essential research issue. To obtain more accurate similarity search results in many time-series applications, preprocessing is performed on the time-series before they are compared. The preprocessing removes distortions such as offset translation, amplitude scaling, linear trends, and noise inherent in time-series. In this paper, we propose an algorithm for distortion-free predictive streaming time-series matching. Similarity matching on streaming time-series is saliently different from traditional time-series in that it is not feasible to directly apply the traditional algorithms for streaming time-series. Our algorithm is distortion-free in the sense that it performs preprocessing on streaming time-series to remove offset translation and amplitude scaling distortions at the same time. Our algorithm is also predictive, since it performs streaming time-series matching against the predicted most recent subsequences in the near future, and thus improves search performance. To the best of our knowledge, no streaming time-series matching algorithm currently performs preprocessing and predicts future search results simultaneously.
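
The abstract does not spell out how offset translation and amplitude scaling are removed. One common preprocessing step with that effect is z-normalization (subtract the mean, divide by the standard deviation), sketched below under that assumption. The function name `normalize` is hypothetical, and the sketch works on a stored sequence rather than a stream, so it does not reflect the paper's streaming or predictive machinery.

```python
import math


def normalize(seq):
    """Z-normalize seq: subtract the mean, then divide by the standard deviation.

    Assumption for illustration only: this is one standard way to remove
    offset translation (a constant added) and amplitude scaling (a positive
    constant multiplied); the paper's exact preprocessing may differ.
    """
    n = len(seq)
    mean = sum(seq) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in seq) / n)
    if std == 0:
        # A constant sequence has no shape left after removing the offset.
        return [0.0] * n
    return [(x - mean) / std for x in seq]


original = [1, 2, 3, 4]
distorted = [10 + 5 * x for x in original]  # shifted by 10, scaled by 5
same_shape = all(
    abs(a - b) < 1e-9 for a, b in zip(normalize(original), normalize(distorted))
)
```

After normalization the two sequences coincide, so a distance computed on the normalized forms compares their shapes rather than their offsets or amplitudes.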

IEEE Transactions on Knowledge and Data Engineering | 2005

A formal framework for prefetching based on the type-level access pattern in object-relational DBMSs

Wook-Shin Han; Kyu-Young Whang; Yang-Sae Moon

Prefetching is an effective method for minimizing the number of fetches between the client and the server in a database management system. In this paper, we formally define the notion of prefetching. We also formally propose new notions of the type-level access locality and type-level access pattern. The type-level access locality is a phenomenon that repetitive patterns exist in the attributes referenced. The type-level access pattern is a pattern of attributes that are referenced in accessing the objects. We then develop an efficient capturing and prefetching policy based on this formal framework. Existing prefetching methods are based on object-level or page-level access patterns, which consist of object-ids or page-ids of the objects accessed. However, the drawback of these methods is that they work only when exactly the same objects or pages are accessed repeatedly. In contrast, even though the same objects are not accessed repeatedly, our technique effectively prefetches objects if the same attributes are referenced repeatedly, i.e., if there is type-level access locality. Many navigational applications in object-relational database management systems (ORDBMSs) have type-level access locality. Therefore, our technique can be employed in ORDBMSs to effectively reduce the number of fetches, thereby significantly enhancing the performance. We also address issues in implementing the proposed algorithm. We have conducted extensive experiments in a prototype ORDBMS to show the effectiveness of our algorithm. Experimental results using the OO7 benchmark, a real GIS application, and an XML application show that our technique reduces the number of fetches by orders of magnitude and improves the elapsed time by several factors over on-demand fetching and context-based prefetching, which is a state-of-the-art prefetching method. These results indicate that our approach provides a new paradigm in prefetching that improves performance of navigational applications significantly and is a practical method that can be implemented in commercial ORDBMSs.

Information Systems | 2001

Efficient time-series subsequence matching using duality in constructing windows

Yang-Sae Moon; Kyu-Young Whang; Woong-Kee Loh

In this paper, we propose a new subsequence matching method, Dual Match. Dual Match exploits duality in constructing windows and significantly improves performance. Dual Match divides data sequences into disjoint windows and the query sequence into sliding windows, and thus, is a dual approach of the one by Faloutsos et al. (Proceedings of the ACM SIGMOD International Conference on Management of Data, Seattle, Washington, 1994, pp. 419–429) (FRM in short), which divides data sequences into sliding windows and the query sequence into disjoint windows. FRM causes a lot of false alarms (i.e., candidates that do not qualify) by storing minimum bounding rectangles rather than individual points representing windows to save storage space for the index. Dual Match solves this problem by directly storing points without incurring excessive storage overhead. Experimental results show that, in most cases, Dual Match provides large improvement both in false alarms and performance over FRM given the same amount of storage space. In particular, for low selectivities (less than 10⁻⁴), Dual Match significantly improves performance up to 430-fold. On the other hand, for high selectivities (more than 10⁻²), it shows a very minor degradation (less than 29%). For selectivities in between (10⁻⁴ to 10⁻²), Dual Match shows performance slightly better than that of FRM. Overall, these results indicate that our approach provides a new paradigm in subsequence matching that improves performance significantly in large database applications.
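
The two window constructions named in this abstract can be sketched as follows. This is only a minimal illustration of what "disjoint" and "sliding" windows mean for a window size w; it omits the dimensionality reduction, index, and matching steps, and the function names `disjoint_windows` and `sliding_windows` are hypothetical.

```python
def disjoint_windows(seq, w):
    """Split seq into consecutive non-overlapping windows of size w.

    A trailing remainder shorter than w is dropped.
    """
    return [seq[i:i + w] for i in range(0, len(seq) - w + 1, w)]


def sliding_windows(seq, w):
    """Return every window of size w, one per starting position in seq."""
    return [seq[i:i + w] for i in range(len(seq) - w + 1)]


data = list(range(7))
query = list(range(5))

# Dual Match: disjoint windows on the data side, sliding windows on the query side.
dual_data = disjoint_windows(data, 3)
dual_query = sliding_windows(query, 3)

# FRM: the roles are swapped.
frm_data = sliding_windows(data, 3)
frm_query = disjoint_windows(query, 3)
```

Disjoint windows produce roughly 1/w as many entries as sliding windows over the same sequence, which is what lets Dual Match store individual points for the data side without excessive storage overhead.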

International Conference on Management of Data | 2011

A new approach for processing ranked subsequence matching based on ranked union

Wook-Shin Han; Jinsoo Lee; Yang-Sae Moon; Seung-won Hwang; Hwanjo Yu

Ranked subsequence matching finds top-k subsequences most similar to a given query sequence from data sequences. Recently, Han et al. [12] proposed a solution (referred to here as HLMJ) to this problem by using the concept of the minimum distance matching window pair (MDMWP) and a global priority queue. By using the concept of MDMWP, HLMJ can prune many unnecessary accesses to data subsequences using a lower bound distance. However, we notice that HLMJ may incur serious performance overhead for important types of queries. In this paper, we propose a novel systematic framework to solve this problem by viewing ranked subsequence matching as ranked union. Specifically, we propose a notion of the matching subsequence equivalence class (MSEQ) and a novel lower bound called the MSEQ-distance. To completely eliminate the performance problem of HLMJ, we also propose a cost-aware density-based scheduling technique, where we consider both the density and cost of the priority queue. Extensive experimental results with many real datasets show that the proposed algorithm outperforms HLMJ and the adapted PSM [22], a state-of-the-art index-based merge algorithm supporting non-monotonic distance functions, by up to two to three orders of magnitude, respectively.

Information Sciences | 2003

PrefetchGuide: capturing navigational access patterns for prefetching in client/server object-oriented/object-relational DBMSs

Wook-Shin Han; Yang-Sae Moon; Kyu-Young Whang

In prefetching, the objects that are expected to be accessed in the future are fetched from the server to the client in advance. Prefetching reduces the number of round-trips and increases the system performance. To prefetch objects effectively, we need to correctly predict the future navigational patterns. In this paper, we propose the PrefetchGuide, a novel data structure that captures the navigational access patterns. We also formally define the notion of the attribute access log set and analyze the navigational access patterns that can be captured by the PrefetchGuide. We then present a prefetching algorithm using the PrefetchGuide. To show the effectiveness of our algorithm, we have conducted extensive experiments in a prototype object-relational database management system (DBMS). The results show that our method significantly outperforms the state-of-the-art prefetching method. These results indicate that our approach provides a practical method that can be implemented in commercial object-oriented/object-relational DBMSs. We believe our method is practically usable for object-oriented programmers and DBMS implementors.

International Conference on Management of Data | 2001

Dynamic buffer allocation in video-on-demand systems

Sang Ho Lee; Kyu-Young Whang; Yang-Sae Moon; Il-Yeol Song

In video-on-demand (VOD) systems, as the size of the buffer allocated to user requests increases, initial latency and memory requirements increase. Hence, the buffer size must be minimized. The existing static buffer allocation scheme, however, determines the buffer size based on the assumption that the system is in the fully loaded state. Thus, when the system is in a partially loaded state, the scheme allocates a buffer larger than necessary to a user request. This paper proposes a dynamic buffer allocation scheme that allocates to user requests buffers of the minimum size in a partially loaded state as well as in the fully loaded state. The inherent difficulty in determining the buffer size in the dynamic buffer allocation scheme is that the size of the buffer currently being allocated is dependent on the number of and the sizes of the buffers to be allocated in the next service period. We solve this problem by the predict-and-enforce strategy, where we predict the number and the sizes of future buffers based on inertia assumptions and enforce these assumptions at runtime. Any violation of these assumptions is resolved by deferring service to the violating new user request until the assumptions are satisfied. Since the size of the current buffer is dependent on the sizes of the future buffers, the size is represented by a recurrence equation. We provide a solution to this equation, which can be computed at the system initialization time for runtime efficiency. We have performed extensive analysis and simulation. The results show that the dynamic buffer allocation scheme reduces initial latency (averaged over the number of user requests in service from one to the maximum capacity) to 1/29.4 ∼ 1/11.0 of that for the static one and, by reducing the memory requirement, increases the number of concurrent user requests to 2.36 ∼ 3.25 times that of the static one when averaged over the amount of system memory available. These results demonstrate that the dynamic buffer allocation scheme significantly improves the performance and capacity of VOD systems.
