Suh-Yin Lee

National Chiao Tung University

Publication

Featured research published by Suh-Yin Lee.

Expert Systems With Applications | 2009

Mining frequent itemsets over data streams using efficient window sliding techniques

Hua-Fu Li; Suh-Yin Lee

Online mining of frequent itemsets over a stream sliding window is one of the most important problems in stream data mining, with broad applications. It is also a difficult problem because streaming data possess challenging characteristics: unknown or unbounded size, a possibly very fast arrival rate, the inability to backtrack over previously arrived transactions, and a lack of system control over the order in which the data arrive. In this paper, we propose an effective bit-sequence based, one-pass algorithm, called MFI-TransSW (Mining Frequent Itemsets within a Transaction-sensitive Sliding Window), to mine the set of frequent itemsets from data streams within a transaction-sensitive sliding window, which consists of a fixed number of transactions. The proposed MFI-TransSW algorithm consists of three phases: window initialization, window sliding and pattern generation. First, every item of each transaction is encoded in an effective bit-sequence representation in the window initialization phase. This bit-sequence representation of items reduces the time and memory needed to slide the windows in the following phases. Second, MFI-TransSW uses the left bit-shift technique to slide the windows efficiently in the window sliding phase. Finally, the complete set of frequent itemsets within the current sliding window is generated by a level-wise method in the pattern generation phase. Experimental studies show that the proposed algorithm not only attains highly accurate mining results, but also runs significantly faster and consumes less memory than existing algorithms for mining frequent itemsets over data streams with a sliding window. Furthermore, based on the MFI-TransSW framework, an extended single-pass algorithm, called MFI-TimeSW (Mining Frequent Itemsets within a Time-sensitive Sliding Window), is presented to mine the set of frequent itemsets efficiently over time-sensitive sliding windows.
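The three phases described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `BitSequenceWindow`, its methods, and the choice to store each bit-sequence as a Python integer (newest transaction in the lowest bit) are assumptions made for illustration only.

```python
from itertools import combinations


class BitSequenceWindow:
    """Illustrative sliding window over the last `window_size` transactions."""

    def __init__(self, window_size):
        self.mask = (1 << window_size) - 1  # keeps exactly window_size bits
        self.bits = {}                      # item -> bit-sequence stored as an int

    def add(self, transaction):
        # Window sliding: left-shift every bit-sequence; the bit of the
        # oldest transaction falls off the high end of the mask.
        for item in list(self.bits):
            shifted = (self.bits[item] << 1) & self.mask
            if shifted:
                self.bits[item] = shifted
            else:
                del self.bits[item]         # item no longer occurs in the window
        # Encode the new transaction in the lowest bit.
        for item in set(transaction):
            self.bits[item] = self.bits.get(item, 0) | 1

    def support(self, itemset):
        # Support = number of 1 bits in the AND of the items' bit-sequences.
        acc = self.mask
        for item in itemset:
            acc &= self.bits.get(item, 0)
        return bin(acc).count("1")

    def frequent_itemsets(self, min_count):
        # Pattern generation: level-wise, joining frequent k-itemsets
        # into (k+1)-candidates and keeping those with enough support.
        level = [frozenset([i]) for i in self.bits
                 if self.support([i]) >= min_count]
        result = {}
        while level:
            for itemset in level:
                result[itemset] = self.support(itemset)
            candidates = {a | b for a, b in combinations(level, 2)
                          if len(a | b) == len(a) + 1}
            level = [c for c in candidates if self.support(c) >= min_count]
        return result


window = BitSequenceWindow(window_size=3)
for transaction in [["a", "b"], ["b", "c"], ["a", "b", "c"], ["a"]]:
    window.add(transaction)
print(window.frequent_itemsets(min_count=2))
```

Storing each item as one integer means sliding costs a shift and a mask per item, and counting the support of an itemset costs one AND per item, which is the kind of saving the abstract attributes to the bit-sequence representation.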

Proceedings of the 4th ACM International Workshop on Video Surveillance and Sensor Networks | 2006

Human action recognition using star skeleton

Hsuan-Sheng Chen; Hua-Tsung Chen; Yi-Wen Chen; Suh-Yin Lee

This paper presents an HMM-based methodology for action recognition using star skeleton as a representative descriptor of human posture. Star skeleton is a fast skeletonization technique that connects the centroid of the target object to its contour extremes. To use star skeleton as a feature for action recognition, we define the feature as a five-dimensional vector in star fashion, because the head and four limbs are usually local extremes of the human shape. In our proposed method, an action is composed of a series of star skeletons over time. Therefore, time-sequential images expressing human action are transformed into a feature vector sequence. The feature vector sequence must then be transformed into a symbol sequence so that an HMM can model the action. We design a posture codebook, which contains representative star skeletons of each action type, and define a star distance to measure the similarity between feature vectors. Each feature vector of the sequence is matched against the codebook and is assigned to the symbol that is most similar. Consequently, the time-sequential images are converted to a symbol posture sequence. We use HMMs to model each action type to be recognized. In the training phase, the model parameters of the HMM of each category are optimized so as to best describe the training symbol sequences. For human action recognition, the model which best matches the observed symbol sequence is selected as the recognized category. We implement a system to automatically recognize ten different types of actions, and the system has been tested on real human action videos in two cases. One case is the classification of 100 video clips, each containing a single action type. A 98% recognition rate is obtained. The other case is a more realistic situation in which a person performs a series of combined actions. Action-series recognition is achieved by referring to a period of posture history using a sliding window scheme. The experimental results show promising performance.
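The codebook-matching step described in this abstract (feature vector sequence to symbol sequence) can be sketched as follows. The abstract does not give the definition of the star distance, so the `star_distance` below, a sum of Euclidean distances between corresponding star vectors, is an assumed stand-in, and the function names and the representation of a posture as five `(dx, dy)` offsets from the centroid are hypothetical.

```python
import math


def star_distance(u, v):
    """Assumed stand-in for the paper's star distance: the sum of Euclidean
    distances between corresponding centroid-to-extreme vectors."""
    return sum(math.dist(p, q) for p, q in zip(u, v))


def to_symbols(features, codebook):
    """Map each feature vector to the index of its most similar codeword."""
    return [
        min(range(len(codebook)), key=lambda k: star_distance(f, codebook[k]))
        for f in features
    ]


# Two illustrative codewords: head, two hands, two feet, relative to the centroid.
standing = [(0.0, -1.0), (-0.5, 0.0), (0.5, 0.0), (-0.3, 1.0), (0.3, 1.0)]
arms_up = [(0.0, -1.0), (-0.5, -0.8), (0.5, -0.8), (-0.3, 1.0), (0.3, 1.0)]
codebook = [standing, arms_up]

observed = [standing, arms_up]
print(to_symbols(observed, codebook))
```

The resulting list of codeword indices is the kind of discrete symbol sequence that a discrete-observation HMM can then be trained on or scored against.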

International Workshop on Research Issues in Data Engineering | 2005

Online mining (recently) maximal frequent itemsets over data streams

Hua-Fu Li; Suh-Yin Lee; Man-Kwan Shan

A data stream is a massive, open-ended sequence of data elements continuously generated at a rapid rate. Mining data streams is more difficult than mining static databases because of the huge, high-speed and continuous characteristics of streaming data. In this paper, we propose a new one-pass algorithm called DSM-MFI (Data Stream Mining for Maximal Frequent Itemsets), which mines the set of all maximal frequent itemsets in landmark windows over data streams. A new summary data structure called the summary frequent itemset forest (abbreviated as SFI-forest) is developed for incrementally maintaining the essential information about maximal frequent itemsets embedded in the stream so far. Theoretical analysis and experimental studies show that the proposed algorithm is efficient and scalable for mining the set of all maximal frequent itemsets over the entire history of the data streams.

Journal of Visual Communication and Image Representation | 2009

Physics-based ball tracking and 3D trajectory reconstruction with applications to shooting location estimation in basketball video

Hua-Tsung Chen; Ming-Chun Tien; Yi-Wen Chen; Wen-Jiin Tsai; Suh-Yin Lee

The demand for computer-assisted game study in sports is growing dramatically. This paper presents a practical video analysis system to facilitate semantic content understanding. A physics-based algorithm is designed for ball tracking and 3D trajectory reconstruction in basketball videos and shooting location statistics can be obtained. The 2D-to-3D inference is intrinsically a challenging problem due to the loss of 3D information in projection to 2D frames. One significant contribution of the proposed system lies in the integrated scheme incorporating domain knowledge and physical characteristics of ball motion into object tracking to overcome the problem of 2D-to-3D inference. With the 2D trajectory extracted and the camera parameters calibrated, physical characteristics of ball motion are involved to reconstruct the 3D trajectories and estimate the shooting locations. Our experiments on broadcast basketball videos show promising results. We believe the proposed system will greatly assist intelligence collection and statistics analysis in basketball games.

Systems, Man, and Cybernetics | 1997

On-line signature verification using LPC cepstrum and neural networks

Quen-Zong Wu; I-Chang Jou; Suh-Yin Lee

An on-line signature verification scheme based on linear prediction coding (LPC) cepstrum and neural networks is proposed. Cepstral coefficients derived from linear predictor coefficients of the writing trajectories are calculated as the features of the signatures. These coefficients are used as inputs to the neural networks. A number of single-output multilayer perceptrons (MLPs), as many as the number of words in the signature, are provided for each registered person to verify the input signature. If the sum of the output values of all MLPs is larger than the verification threshold, the input signature is regarded as genuine; otherwise, the input signature is regarded as a forgery. Simulations show that this scheme can detect the genuineness of the input signatures from a test database with an error rate as low as 4%.
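Two steps in this abstract lend themselves to a short sketch: deriving cepstral coefficients from linear predictor coefficients, and the threshold decision over the MLP outputs. The recursion below is the standard textbook LPC-to-cepstrum conversion for an all-pole model with predictor x[n] ≈ Σ a_k x[n-k]; the abstract does not say which formulation the authors used, so treat it as an assumption. The function names and the example threshold are hypothetical.

```python
def lpc_to_cepstrum(a, n_ceps):
    """Convert predictor coefficients a[0..p-1] (a_1..a_p) to cepstral
    coefficients c_1..c_n_ceps using the standard recursion:
    c_n = a_n + sum_{k=1}^{n-1} (k/n) * c_k * a_{n-k}, with a_n = 0 for n > p."""
    p = len(a)
    c = [0.0] * (n_ceps + 1)  # c[0] is unused padding for 1-based indexing
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]


def is_genuine(mlp_outputs, threshold):
    """Decision rule from the abstract: accept when the summed MLP outputs
    exceed the verification threshold."""
    return sum(mlp_outputs) > threshold


print(lpc_to_cepstrum([0.5], n_ceps=3))
print(is_genuine([0.9, 0.8, 0.7], threshold=2.0))
```

For a single-pole predictor with coefficient a, the recursion reproduces the closed form c_n = a^n / n, which is a convenient sanity check.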

Expert Systems With Applications | 2009

Incremental updates of closed frequent itemsets over continuous data streams

Hua-Fu Li; Chin-Chuan Ho; Suh-Yin Lee

Online mining of closed frequent itemsets over streaming data is one of the most important issues in mining data streams. In this paper, we propose an efficient one-pass algorithm, NewMoment, to maintain the set of closed frequent itemsets in data streams with a transaction-sensitive sliding window. An effective bit-sequence representation of items is used in the proposed algorithm to reduce the time and memory needed to slide the windows. Experiments show that the proposed algorithm not only attains highly accurate mining results, but also runs significantly faster and consumes less memory than the existing algorithm Moment for mining closed frequent itemsets over recent data streams.

Pattern Recognition Letters | 1991

Picture algebra for spatial reasoning of iconic images represented in 2D C-string

Suh-Yin Lee; Fang-Jung Hsu

A new spatial knowledge representation, the 2D C-string, with an accompanying cutting mechanism and a set of spatial operators, is proposed. The 2D C-string captures the spatial knowledge embedded in images and is efficient for the representation and manipulation of images. In this paper, transitive laws, distributive laws and manipulation laws of picture algebra are presented. All the binary relationships among objects in an image can be derived from the 2D C-string. The picture algebra provides the theoretical basis for spatial reasoning and pictorial query inference.

Knowledge and Information Systems | 2012

Efficient algorithms for influence maximization in social networks

Yi-Cheng Chen; Wen-Chih Peng; Suh-Yin Lee

In recent years, due to the surge in popularity of social-networking web sites, considerable interest has arisen regarding influence maximization in social networks. Given a social network structure, the problem of influence maximization is to determine a minimum set of nodes that could maximize the spread of influences. With a large-scale social network, the efficiency and practicability of such algorithms are critical. Although many recent studies have focused on the problem of influence maximization, these works in general are time-consuming when a social network is large-scale. In this paper, we propose two novel algorithms, CDH-Kcut and Community and Degree Heuristic on Kcut/SHRINK, to solve the influence maximization problem based on a realistic model. The algorithms utilize the community structure, which significantly decreases the number of candidates of influential nodes, to avoid information overlap. The experimental results on both synthetic and real datasets indicate that our algorithms not only significantly outperform the state-of-the-art algorithms in efficiency but also possess graceful scalability.

Journal of Information Science and Engineering | 2005

Fast discovery of sequential patterns through memory indexing and database partitioning

Ming-Yen Lin; Suh-Yin Lee

Sequential pattern mining is a challenging issue because of the high complexity of temporal pattern discovering from numerous sequences. Current mining approaches either require frequent database scanning or the generation of several intermediate databases. As databases may fit into the ever-increasing main memory, efficient memory-based discovery of sequential patterns is becoming possible. In this paper, we propose a memory indexing approach for fast sequential pattern mining, named MEMISP. During the whole process, MEMISP scans the sequence database only once to read data sequences into memory. The find-then-index technique is recursively used to find the items that constitute a frequent sequence and constructs a compact index set which indicates the set of data sequences for further exploration. As a result of effective index advancing, fewer and shorter data sequences need to be processed in MEMISP as the discovered patterns get longer. Moreover, we can estimate the maximum size of the total memory required, which is independent of the minimum support threshold, in MEMISP. Experimental results indicate that MEMISP outperforms both GSP and PrefixSpan (general version) without the need for either candidate generation or database projection. When the database is too large to fit into memory in a batch, we partition the database, mine patterns in each partition, and validate the true patterns in the second pass of database scanning. Experiments performed on extra-large databases demonstrate the good performance and scalability of MEMISP, even with very low minimum support. Therefore, MEMISP can efficiently mine sequence databases of any size, for any minimum support values.

Information Systems | 2004

Incremental update on sequential patterns in large databases by implicit merging and efficient counting

Ming-Yen Lin; Suh-Yin Lee

Current approaches for sequential pattern mining usually assume that the mining is performed on a static sequence database. However, databases are not static: updates may invalidate discovered patterns and create new ones. In addition to higher complexity, the maintenance of sequential patterns is more challenging than that of association rules owing to sequence merging. Sequence merging, which is unique to sequence databases, requires the appended new sequences to be merged with the existing ones if their customer ids are the same. Re-mining of the whole database appears to be inevitable since the information collected in previous discovery will be corrupted by sequence merging. Instead of re-mining, the proposed IncSP (Incremental Sequential Pattern Update) algorithm solves the maintenance problem through effective implicit merging and efficient separate counting over appended sequences. Patterns found previously are incrementally updated rather than re-mined from scratch. Moreover, the technique of early candidate pruning further speeds up the discovery of new patterns. Empirical evaluation using comprehensive synthetic data shows that IncSP is fast and scalable.
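The sequence merging that this abstract identifies as the source of difficulty can be shown with a small sketch. It performs the merge explicitly, which is the naive baseline; it is not IncSP's implicit merging, whose details the abstract does not give. The function name `merge_appended` and the representation of a database as a dict from customer id to a list of itemsets are assumptions.

```python
def merge_appended(database, appended):
    """Explicitly merge appended sequences into existing ones by customer id.

    Both arguments map a customer id to a sequence, i.e. a list of itemsets
    in time order. Returns a new mapping and leaves the inputs untouched.
    """
    merged = {cid: list(seq) for cid, seq in database.items()}
    for cid, seq in appended.items():
        # Same customer id: the new itemsets extend the existing sequence.
        # Unknown customer id: the appended sequence becomes a new entry.
        merged.setdefault(cid, []).extend(seq)
    return merged


database = {1: [("a",), ("b", "c")], 2: [("a",)]}
appended = {2: [("d",)], 3: [("b",)]}
print(merge_appended(database, appended))
```

After the merge, customer 2's sequence is longer than the one counted during the earlier discovery, so supports computed before the update no longer describe the merged sequence; this is the corruption the abstract refers to.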
