Vit Niennattrakul
Chulalongkorn University
Publication
Featured research published by Vit Niennattrakul.
international conference on data mining | 2010
Doruk Sart; Abdullah Mueen; Walid A. Najjar; Eamonn J. Keogh; Vit Niennattrakul
Many time series data mining problems require subsequence similarity search as a subroutine. Dozens of similarity/distance measures have been proposed in the last decade and there is increasing evidence that Dynamic Time Warping (DTW) is the best measure across a wide range of domains. Given DTW’s usefulness and ubiquity, there has been a large community-wide effort to mitigate its relative lethargy. Proposed speedup techniques include early abandoning strategies, lower-bound based pruning, indexing and embedding. In this work we argue that we are now close to exhausting all possible speedup from software, and that we must turn to hardware-based solutions. With this motivation, we investigate both GPU (Graphics Processing Unit) and FPGA (Field Programmable Gate Array) based acceleration of subsequence similarity search under the DTW measure. As we shall show, our novel algorithms allow GPUs to achieve two orders of magnitude speedup and FPGAs to produce four orders of magnitude speedup. We conduct detailed case studies on the classification of astronomical observations and demonstrate that our ideas allow us to tackle problems that would be untenable otherwise.
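As a rough software-side illustration of the subroutine this abstract describes, the sketch below runs a brute-force subsequence search under DTW with early abandoning. It is not the GPU or FPGA algorithm from the paper. The function names, the optional `window` band parameter, the squared-error point cost, and the omission of normalization are all assumptions made for the example.

```python
import math

def dtw_distance(a, b, window=None, best_so_far=math.inf):
    """Squared-error DTW between a and b (hypothetical helper).

    window: optional band half-width limiting how far the alignment may drift.
    best_so_far: abandon and return inf once no path can beat this value.
    """
    n, m = len(a), len(b)
    w = max(window if window is not None else max(n, m), abs(n - m))
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        lo, hi = max(1, i - w), min(m, i + w)
        for j in range(lo, hi + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        # Every warping path crosses this row and costs are non-negative,
        # so the row minimum is a lower bound on the final distance.
        if min(curr[lo:hi + 1]) >= best_so_far:
            return math.inf
        prev = curr
    return prev[m]

def best_match(series, query, window=None):
    """Return (start index, distance) of the best-matching subsequence."""
    best_dist, best_pos = math.inf, -1
    m = len(query)
    for start in range(len(series) - m + 1):
        d = dtw_distance(series[start:start + m], query, window, best_dist)
        if d < best_dist:
            best_dist, best_pos = d, start
    return best_pos, best_dist

print(best_match([0, 0, 1, 2, 3, 2, 1, 0, 0], [1, 2, 3, 2, 1]))
```

Each candidate still costs up to O(nm) work in the worst case, which is the kind of per-candidate cost that motivates the hardware acceleration discussed in the abstract.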
multimedia and ubiquitous engineering | 2007
Vit Niennattrakul; Chotirat Ann Ratanamahatana
Since multimedia data became digital, interest in its storage, retrieval, and processing has grown dramatically. This includes video, images, and audio, for which we now have higher expectations of exploiting the data at hand. Typical manipulations take some form of video/image/audio processing, including automatic speech recognition, which requires a fairly large amount of storage and is computationally intensive. In our recent work, we demonstrated the utility of time series representation in the task of clustering multimedia data using the k-medoids method, which allows a considerable reduction in computational effort and storage space. However, k-means is a much more generic clustering method when Euclidean distance is used. In this work, we demonstrate that, unfortunately, k-means clustering will sometimes fail to give correct results, a fact that many researchers may overlook. This is especially the case when Dynamic Time Warping (DTW) is used as the distance measure in averaging the shape of time series. We also demonstrate that the current averaging algorithm may not produce the real average of the time series and thus generates incorrect k-means clustering results, and we then show potential causes of why DTW averaging methods may not achieve meaningful clustering results. Lastly, we conclude with a suggestion of a method to potentially find the shape-based time series average that satisfies the required properties.
international conference on electrical engineering/electronics, computer, telecommunications and information technology | 2009
Vit Niennattrakul; Chotirat Ann Ratanamahatana
The Dynamic Time Warping (DTW) distance measure has increasingly been used as a similarity measure for various data mining tasks in place of the traditional Euclidean distance metric due to its superior flexibility in sequence alignment. However, in some tasks where shape averaging is required, e.g., in template matching and k-means clustering problems, current averaging methods are inaccurate in that they produce undesired templates and cluster representatives. In this work, we emphasize the importance of the correctness of this averaging subroutine and propose a novel shape averaging method, called Prioritized Shape Averaging (PSA), using a hierarchical clustering approach. In experimental evaluation, our proposed method, PSA, achieves a lower discrepancy distance between an averaged sequence and every original sequence than existing methods across various domains.
international conference on conceptual structures | 2007
Vit Niennattrakul; Chotirat Ann Ratanamahatana
Shape averaging, or signal averaging, of time series data is one of the prevalent subroutines in data mining tasks. The Dynamic Time Warping (DTW) distance measure is known to work exceptionally well with time series data, as has long been demonstrated in various data mining tasks involving shape similarity across many domains. Therefore, DTW has been used to find the average shape of two time series according to the optimal mapping between them. Several methods have been proposed, some of which require the number of time series being averaged to be a power of two. In this work, we demonstrate that these proposed methods cannot produce the real average of the time series. We conclude with a suggestion of a method to potentially find the shape-based time series average.
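To make "averaging two time series according to the optimal mapping between them" concrete, here is a minimal sketch of one generic way to do it: compute the DTW warping path and average the paired points along it. It is an assumed textbook-style formulation, not necessarily any of the specific methods the paper examines, and `dtw_path` and `pairwise_average` are hypothetical names.

```python
def dtw_path(a, b):
    """Return the optimal warping path between a and b as (i, j) index pairs."""
    n, m = len(a), len(b)
    inf = float("inf")
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            D[i][j] = cost + min(D[i - 1][j - 1], D[i - 1][j], D[i][j - 1])
    # Backtrack from the end; min() keeps the first minimum, so the
    # diagonal step is preferred on ties.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (D[i - 1][j - 1], i - 1, j - 1),
            (D[i - 1][j], i - 1, j),
            (D[i][j - 1], i, j - 1),
        ]
        _, i, j = min(steps, key=lambda s: s[0])
    path.reverse()
    return path

def pairwise_average(a, b):
    """Average two sequences along their warping path, one point per path step."""
    return [(a[i] + b[j]) / 2 for i, j in dtw_path(a, b)]

print(pairwise_average([0, 1, 0], [0, 0, 1, 0]))
```

Two properties of this sketch hint at why such averaging is delicate: the result has one point per path step, so it can be longer than either input, and when more than two sequences are combined pair by pair the outcome generally depends on the pairing order.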
Knowledge Based Systems | 2012
Vit Niennattrakul; Dararat Srisai; Chotirat Ann Ratanamahatana
Dynamic time warping (DTW) distance has been proven to be one of the most accurate distance measures for time series classification. However, its computational complexity is a major drawback, especially when a massive training database has to be searched. Although many techniques have been proposed to speed up the search, including indexing structures and lower bounding functions, for large databases it is still untenable to embed the algorithm and search through the entire database within a given time on a system with limited resources, e.g., tiny sensors. Template matching is therefore a solution that efficiently reduces storage and computation requirements; in other words, only a few time series sequences have to be retrieved and compared with incoming query data. In this work, we propose a novel template matching framework using DTW distance, in which a shape-based averaging algorithm is utilized to construct meaningful templates. Our proposed framework demonstrates its utility: classification time is reduced by orders of magnitude while maintaining good accuracy relative to rival methods.
digital image computing: techniques and applications | 2012
Supawadee Saengsri; Vit Niennattrakul; Chotirat Ann Ratanamahatana
Thai Sign Language has been a research priority since most people do not understand sign language, making daily-life communication with people who are deaf or mute almost impossible. Past research on Thai Sign Language recognition that employs image processing techniques still does not perform well because of limitations in extracting key features from similar hand images. To alleviate these problems and improve performance, this paper proposes a Thai Sign Language recognition system using data gloves and a motion tracker device. Our focus is primarily on alphabetic finger-spelling of Thai Sign Language by recognizing single-gesture hand shapes. Data segmentation and neural network techniques are utilized to improve the accuracy of the system.
international conference on electrical engineering electronics computer telecommunications and information technology | 2011
Pawan Nunthanid; Vit Niennattrakul; Chotirat Ann Ratanamahatana
One significant task in time series mining research is motif discovery, which is the first step in finding interesting patterns in a time series sequence. Recently, many motif discovery algorithms have been proposed to replace the untenable brute-force algorithm and improve time complexity. However, those motif discovery algorithms still need a predefined sliding window length that must be known a priori. This sliding window length is sensitive in that a small difference in its value can lead to a huge difference in motif results. In this paper, we present a novel motif discovery algorithm that requires no window length parameter. The proposed algorithm automatically returns suitable motif lengths from all possible sliding window lengths; in other words, our algorithm efficiently reduces a large set of possible sliding window lengths down to a few truly interesting variable-length motifs.
international conference on data mining | 2010
Vit Niennattrakul; Eamonn J. Keogh; Chotirat Ann Ratanamahatana
The problem of finding outliers in data has broad applications in areas as diverse as data cleaning, fraud detection, network monitoring, invasive species monitoring, etc. While there are dozens of techniques that have been proposed to solve this problem for static data collections, very simple distance-based outlier detection methods are known to be competitive or superior to more complex methods. However, distance-based methods have time and space complexities that make them impractical for streaming data and/or resource limited sensors. In this work, we show that simple data-editing techniques can make distance-based outlier detection practical for very fast streams and resource limited sensors. Our technique generalizes to produce two algorithms, which, relative to the original algorithm, can guarantee to produce no false positives, or guarantee to produce no false negatives. Our methods are independent of both data type and distance measure, and are thus broadly applicable.
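For readers unfamiliar with the baseline, the sketch below shows one common formulation of distance-based outlier detection: an object is flagged if fewer than a given number of reference objects lie within a given radius. This formulation, the function name `is_outlier`, and its parameters are assumptions for illustration; the data-editing techniques proposed in the paper are not reproduced here. The distance is passed in as a callable, in keeping with the abstract's point that the approach is independent of data type and distance measure.

```python
def is_outlier(query, reference, dist, radius, min_neighbors):
    """Flag query as an outlier if fewer than min_neighbors reference
    objects lie within radius of it under the supplied distance."""
    count = 0
    for obj in reference:
        if dist(query, obj) <= radius:
            count += 1
            if count >= min_neighbors:
                return False  # enough neighbors found, stop scanning early
    return True

# Usage with scalar readings and absolute difference as the distance.
reference = [1.0, 1.1, 0.9, 1.2]
abs_diff = lambda x, y: abs(x - y)
print(is_outlier(1.05, reference, abs_diff, radius=0.2, min_neighbors=2))
print(is_outlier(5.0, reference, abs_diff, radius=0.2, min_neighbors=2))
```

In the worst case each query is compared with every stored reference object, so time and memory grow with the reference set. That is the cost the abstract identifies as impractical for fast streams and resource-limited sensors.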
knowledge discovery and data mining | 2012
Warissara Meesrikamolkul; Vit Niennattrakul; Chotirat Ann Ratanamahatana
One of the most famous algorithms for time series data clustering is k-means clustering with Euclidean distance as a similarity measure. However, many recent works have shown that the Dynamic Time Warping (DTW) distance measure is more suitable for most time series data mining tasks due to its much improved alignment based on shape. Unfortunately, k-means clustering with DTW distance is still not practical since the current averaging functions fail to preserve the characteristics of time series data within the cluster. Recently, the Shape-based Template Matching Framework (STMF) has been proposed to discover a cluster representative of time series data. However, STMF is very computationally expensive. In this paper, we propose Shape-based Clustering for Time Series (SCTS) using a novel averaging method called Ranking Shape-based Template Matching Framework (RSTMF), which can average a group of time series effectively but takes up to 400 times less computational time than STMF. In addition, our method outperforms other well-known clustering techniques in terms of accuracy and criterion based on known ground truth.
Data Mining and Knowledge Discovery | 2010
Vit Niennattrakul; Pongsakorn Ruengronghirunya; Chotirat Ann Ratanamahatana
Among the many existing distance measures for time series data, Dynamic Time Warping (DTW) distance has been recognized as one of the most accurate and suitable distance measures due to its flexibility in sequence alignment. However, DTW distance calculation is computationally intensive. In very large time series databases especially, a sequential scan through the entire database is impractical, and even random access that exploits an index structure suffers, since the high dimensionality of time series data incurs extremely high I/O cost. More specifically, a sequential structure incurs high CPU but low I/O costs, while an index structure incurs low CPU but high I/O costs. In this work, we therefore propose a novel indexed sequential structure called TWIST (Time Warping in Indexed Sequential sTructure), which benefits from both sequential access and index structure. When a query sequence is issued, TWIST calculates lower bounding distances between a group of candidate sequences and the query sequence and then identifies the data access order in advance, greatly reducing the number of both sequential and random accesses. Our indexed sequential structure achieves significant speedup in the querying process. In addition, our method outperforms existing rival methods in terms of query processing time, number of page accesses, and storage requirement, while guaranteeing no false dismissals.
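The idea of computing lower bounds first and then fixing the access order can be sketched in isolation. The code below is not the TWIST structure itself: it is a generic in-memory search that sorts candidates by a lower bound and stops once the next bound cannot beat the best exact distance found so far. As long as the bound never exceeds the exact distance, no true nearest neighbor is dismissed. The names `ordered_search`, `first_last_bound`, and `squared_euclidean` are hypothetical, and the bound/distance pair is a deliberately simple stand-in for a DTW lower bound and DTW itself.

```python
def ordered_search(query, candidates, lower_bound, dist):
    """Nearest-neighbor search that visits candidates in lower-bound order.

    Returns (best index, best distance, number of exact distances computed).
    Requires lower_bound(q, c) <= dist(q, c) for every candidate.
    """
    bounds = sorted((lower_bound(query, c), idx) for idx, c in enumerate(candidates))
    best_idx, best_dist, computed = -1, float("inf"), 0
    for lb, idx in bounds:
        if lb >= best_dist:
            break  # all remaining bounds are at least as large
        d = dist(query, candidates[idx])
        computed += 1
        if d < best_dist:
            best_idx, best_dist = idx, d
    return best_idx, best_dist, computed

def first_last_bound(q, c):
    """Stand-in bound: squared error on the first and last points only."""
    return (q[0] - c[0]) ** 2 + (q[-1] - c[-1]) ** 2

def squared_euclidean(q, c):
    """Stand-in exact distance for equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(q, c))

candidates = [[5, 5, 5], [0, 1, 0.5], [0, 2, 0], [4, 1, 4]]
print(ordered_search([0, 1, 0], candidates, first_last_bound, squared_euclidean))
```

In this toy run only two of the four exact distances are computed; the other two candidates are skipped because their bounds already exceed the best distance found.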