Li-Jen Kao | Researchain

Publication

Featured research published by Li-Jen Kao.

computer and information technology | 2008

Identifying insightful salinity and temperature variations in ocean data

Yo-Ping Huang; Li-Jen Kao; Frode Eika Sandnes

Global ocean salinity and temperature variations are attracting increasing attention as scientists try to extract more knowledge from collected ocean data to better understand global change. Association rule mining can be applied to ocean salinity and temperature data to discover spatial-temporal patterns that reveal salinity and temperature variations. Since we are addressing the associations of salinity/temperature events across different times and locations, the events can be grouped into clusters before mining starts, and a discovered association rule whose antecedent and consequent come from different clusters will be of most interest. However, are the discovered rules important or insightful? In this paper, an importance measurement for association rules with antecedent and consequent from different clusters is proposed. The importance measurement quantifies the impact of the rule's antecedent and consequent on their respective clusters. A rule is insightful if its importance measurement is above a predefined threshold. The insightful rules can then be presented to experts for further analysis.

systems, man and cybernetics | 2012

Association rules based algorithm for identifying outlier transactions in data stream

Li-Jen Kao; Yo-Ping Huang

Most outlier detection algorithms are designed to discover outlier patterns in static databases. Those algorithms are infeasible for instant identification of outlier patterns in data streams, where continuously arriving, unbounded data serve as the data source in many applications such as sensor data feeds. In this paper, an association-rule-based method is proposed to find outlier patterns in data streams. The presented work segments transactions from data streams and then finds approximate frequent itemsets with a single data scan instead of requiring multiple scans. Based on the derived association rules, a transaction can be identified as an outlier if its outlier degree is higher than a predefined threshold. The proposed method not only finds the outlier patterns but also identifies the items most likely to have induced the abnormal transactions in the data streams. Efficiency comparisons with frequent-itemset-based work are also carried out to verify the effectiveness of the proposed framework.

systems, man and cybernetics | 2007

Data mining and fuzzy inference based salinity and temperature variation prediction

Yo-Ping Huang; Li-Jen Kao; Frode Eika Sandnes

The ARGO project archives huge quantities of upper ocean salinity/temperature time series measurements that are related to climate issues such as global warming. Fuzzy inter-transaction association rules are derived from ARGO data using a reduced prefix-projected item set algorithm that has a small space and time complexity. After mining the frequent 1-itemsets the proposed algorithm exploits a reduced prefix projection strategy to extract the frequent inter-itemsets. Based on the extracted fuzzy inter-transaction association rules a fuzzy inference model is proposed for identifying salinity/temperature anomalies. Experimental results verify that the proposed model is effective in predicting the occurrence of abnormal salinity/temperature variations.

IEEE Annual Meeting of the Fuzzy Information Processing Society (NAFIPS '04) | 2004

Using fuzzy support and confidence setting to mine interesting association rules

Yo-Ping Huang; Li-Jen Kao

This paper proposes a fuzzy model to derive the appropriate minimum support and confidence thresholds for mining association rules. Traditional association rule mining techniques are usually based on user-defined minimum support and confidence values. The most important problem is how to select the appropriate minimum support and confidence to find frequent itemsets. The Apriori algorithm, the widely adopted approach, exploits the following property to derive the frequent itemsets: if an itemset is frequent, so are all its subsets. That is, the Apriori algorithm generates itemsets in a level-wise manner where each candidate in the jth iteration is generated from the frequent (j-1)-itemsets of the previous iteration. A generated candidate can be further pruned if any of its subsets of size j-1 is not a frequent itemset. The Apriori algorithm relies on the essential assumption that all itemsets have a uniform minimum support value, i.e., we assume that all items in the dataset have the same nature, e.g., all the items have the same sale price or the same salability in different time intervals or locations. However, this assumption may not comply with the rules embedded in large databases. A concept hierarchy helps to solve the problem, and the support and confidence thresholds should vary, especially when we consider items at different levels of conceptual abstraction. For example, turkey and pumpkin pie are seldom sold together. However, if we look at the transactions in the week before Thanksgiving, we may discover that most transactions contain turkey and pumpkin pie. This means that we should apply different support values to different time intervals. In this paper, we present a framework for multilevel association rule mining in the presence of fuzzy concept hierarchies that derives a reasonable minimum support and confidence setting without losing potentially interesting rules.
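The level-wise generation and pruning described in this abstract can be illustrated with a minimal sketch of the classic Apriori procedure. This is not the paper's fuzzy framework: it uses a single absolute support count for all itemsets (the very assumption the paper argues against), and the function name `apriori` and the sample transactions are hypothetical, chosen only for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support_count} for every itemset whose count
    is at least min_support (an absolute count, assumed uniform)."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count individual items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    j = 2
    while frequent:
        prev = set(frequent)
        candidates = set()
        # Join step: merge pairs of frequent (j-1)-itemsets into j-itemsets.
        for a in prev:
            for b in prev:
                union = a | b
                if len(union) != j:
                    continue
                # Prune step: every (j-1)-subset must itself be frequent.
                if all(frozenset(sub) in prev
                       for sub in combinations(union, j - 1)):
                    candidates.add(union)
        # Count the support of the surviving candidates.
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        j += 1
    return result


# Hypothetical market-basket data.
baskets = [
    {"turkey", "pie", "bread"},
    {"turkey", "pie"},
    {"bread", "milk"},
    {"turkey", "pie", "milk"},
]
frequent_itemsets = apriori(baskets, min_support=2)
```

With a uniform threshold, lowering `min_support` is the only way to surface seasonal pairs such as turkey and pumpkin pie, which is the limitation that motivates varying the thresholds across time intervals or abstraction levels.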

systems, man and cybernetics | 2013

Ejecting Outliers to Enhance Robustness of Fuzzy Cluster Ensemble

Li-Jen Kao; Yo-Ping Huang

Clustering analysis makes significant contributions to healthcare and medical services. However, relying on only one set of clusters obtained from a clustering algorithm, such as the fuzzy c-means (FCM) algorithm, with an arbitrary initialization may not be robust or accurate. The cluster ensemble, the concept of combining multiple sets of clusters produced by a clustering algorithm with several different initializations, can improve robustness. However, outliers taken into the ensemble may lead the final cluster ensemble to inaccurate results. Thus, outliers should be removed before merging different clusters. In this paper, an adapted FCM algorithm is proposed to detect and remove the outliers. The cluster ensemble framework employs this adapted FCM algorithm to generate multiple sets of clusters by giving different initialization parameters. Then, a pairwise approach is used to combine those outlier-free clusters. The experimental results verify that the final clusters obtained from the proposed cluster ensemble framework are more robust.
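For context, the standard (unadapted) FCM algorithm alternates between the two updates below. The abstract does not describe how the adapted version detects outliers, so this shows only the textbook baseline. The symbols are assumptions introduced here: $x_j$ is a data point, $v_i$ a cluster center, $c$ the number of clusters, $n$ the number of points, and $m > 1$ the fuzzifier.

```latex
u_{ij} = \left[ \sum_{k=1}^{c}
  \left( \frac{\lVert x_j - v_i \rVert}{\lVert x_j - v_k \rVert} \right)^{\frac{2}{m-1}}
\right]^{-1},
\qquad
v_i = \frac{\sum_{j=1}^{n} u_{ij}^{m} \, x_j}{\sum_{j=1}^{n} u_{ij}^{m}}
```

Because the memberships of each point sum to one across clusters, a distant outlier still receives nonzero membership everywhere and pulls every center toward it, which is one reason removing outliers before combining clusterings is plausible.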

systems, man and cybernetics | 2011

An efficient strategy to detect outlier transactions for knowledge mining

Li-Jen Kao; Yo-Ping Huang

Instant identification of outlier patterns is very important in modern-day engineering problems such as credit card fraud detection and network intrusion detection. Most previous studies focused on finding outliers that are hidden in numerical datasets. Unfortunately, those outlier detection methods are not directly applicable to real-life transaction databases. Although a few studies have presented methods to find outliers in transaction datasets, they did not address what actually caused the transactions to become abnormal. In this paper, an improved framework is proposed to identify the outlier transactions as well as to find the items most likely to have induced the abnormal transactions. Several definitions are introduced as prerequisites for outlier detection. Efficiency comparisons with previous work are also carried out to verify the effectiveness of the proposed framework.

systems, man and cybernetics | 2010

Fuzzy environment mapping for robot navigation based on grid computing

Li-Jen Kao; Yo-Ping Huang; Frode Eika Sandnes; Mann-Jung Hsiao

In order to navigate autonomously, a mobile robot needs to build a map of the environment in which it is navigating. Currently, sensors are mounted on the robot to detect whether obstacles exist, and a map of the robot's immediate surroundings is then built to support navigation path planning. The map created by this method is a local map, which may cause global navigation problems that only a map with global coverage can solve. In this study, a sensor network is deployed to build a global environment map. All the sensor locations are assumed to be known. The navigation space is divided into grid cells, and each cell is checked for obstacles by one or more sensors. Fuzzy set concepts are used to model sensor perception. The sensors work as a team to explore the whole space, and the global fuzzy map is then constructed. The experiments show that the fuzzy map is more practical and helps the path planning problem to be solved more efficiently.
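One way to picture a grid-based fuzzy map is sketched below. The abstract does not specify the membership function or how readings from several sensors are fused, so everything here is an assumption made for illustration: the linear distance falloff, the use of the maximum (fuzzy union) to combine sensors, and the names `cell_membership` and `build_fuzzy_map`.

```python
import math


def cell_membership(sensor_xy, cell_xy, detected, max_range):
    """Degree to which one sensor supports 'this cell is occupied'.
    Assumed model: confidence falls off linearly with distance."""
    if not detected:
        return 0.0
    distance = math.dist(sensor_xy, cell_xy)
    return max(0.0, 1.0 - distance / max_range)


def build_fuzzy_map(width, height, readings, max_range):
    """Fuse readings from several fixed sensors into one occupancy grid.

    readings: list of (sensor_xy, {cell_xy: detected_bool}) pairs.
    Each cell keeps the highest membership reported by any sensor
    (fuzzy union); this fusion rule is an assumption of the sketch.
    """
    grid = [[0.0] * width for _ in range(height)]
    for sensor_xy, detections in readings:
        for (cx, cy), detected in detections.items():
            mu = cell_membership(sensor_xy, (cx, cy), detected, max_range)
            grid[cy][cx] = max(grid[cy][cx], mu)
    return grid


# Two hypothetical sensors observing a 4 x 1 corridor.
readings = [
    ((0, 0), {(1, 0): True, (2, 0): False}),
    ((3, 0), {(1, 0): True, (2, 0): True}),
]
fuzzy_map = build_fuzzy_map(4, 1, readings, max_range=4)
```

A path planner could then treat each cell's value as a graded traversal cost rather than a hard free/occupied flag.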

Applied Intelligence | 2015

Associating absent frequent itemsets with infrequent items to identify abnormal transactions

Li-Jen Kao; Yo-Ping Huang; Frode Eika Sandnes

Data stored in transactional databases are vulnerable to noise and outliers, which are often discarded at an early stage of data mining. Abnormal transactions in a marketing transactional database are those transactions that should contain some items but do not. However, some abnormal transactions may provide valuable information in the knowledge mining process. The literature on how to efficiently identify abnormal transactions in a database, as well as determine what causes the transactions to be abnormal, is scarce. This paper proposes a framework to identify abnormal transactions as well as the items that induce them. Results from one synthetic and two medical data sets are presented and compared with previous work to verify the effectiveness of the proposed framework.
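The idea of a transaction that "should contain some items but does not" can be sketched with association rules. The scoring below is a hypothetical illustration, not the paper's definition: a rule applies when its antecedent is present, it is violated when part of its consequent is absent, and the outlier degree is taken to be the fraction of applicable rules that are violated. The function name `outlier_degree` and the sample rules are assumptions.

```python
def outlier_degree(transaction, rules):
    """Score a transaction against (antecedent, consequent) rules.

    Returns (degree, missing_items), where degree is the fraction of
    applicable rules whose consequent is not fully present, and
    missing_items collects the expected-but-absent items.
    """
    items = set(transaction)
    applicable = [(a, c) for a, c in rules if set(a) <= items]
    if not applicable:
        return 0.0, set()

    violated = 0
    missing = set()
    for _antecedent, consequent in applicable:
        absent = set(consequent) - items
        if absent:
            violated += 1
            missing |= absent
    return violated / len(applicable), missing


# Hypothetical high-confidence rules mined beforehand.
rules = [
    ({"bread"}, {"butter"}),
    ({"bread", "milk"}, {"eggs"}),
    ({"tea"}, {"sugar"}),
]
degree, missing = outlier_degree({"bread", "milk", "butter"}, rules)
```

A transaction would be flagged when `degree` exceeds a chosen threshold, and `missing` points to the items that make it look abnormal.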

systems, man and cybernetics | 2009

Extracting spatial semantics in association rules for ocean image retrieval

Li-Jen Kao; Yo-Ping Huang; Frode Eika Sandnes

Several research institutions and governmental departments provide ocean images for research purposes. For example, Argo, a worldwide ocean research organization, produces ocean salinity and temperature images, and researchers can download those images from the Internet. One may build an image system to store ocean images and retrieve them later for further research, for example, to predict future salinity or temperature variation. Image retrieval technology is therefore important. This paper describes an ocean image retrieval system based on content-based image retrieval. Currently, content-based image retrieval technology does not exploit high-level semantics, and it is hard to obtain predictive information from retrieved images. Our improvement involves a spatial reference method that helps obtain the spatial relationships between objects in a given image. This allows the spatial semantics between the query image and the images in the database to be considered. Spatial association rules are also mined and subsequently used as a basis for retrieving additional images. Because spatial semantics are considered in both the query image and the spatial association rules, the retrieved images are more accurate. The experimental results verify that the system effectively predicts the occurrence of salinity or temperature variations.

systems, man and cybernetics | 2008

Discriminating important ocean salinity and temperature patterns in argo data

Yo-Ping Huang; Li-Jen Kao; Frode Eika Sandnes

Ocean salinity and temperature variations have been observed for decades to clarify their effect on global climate change. Data mining techniques are effective in extracting implicit and useful information from large databases. Discovering salinity and temperature variation patterns from Argo ocean data will in turn help reveal the spatio-temporal relationship between salinity and temperature variations. However, some of the discovered patterns are trivial because they are already known to oceanographers. In this study, the water mass (a water body with the same salinity and temperature), the mined salinity and temperature patterns, and an entropy importance measure are combined to discriminate important patterns from trivial patterns. This study measures patterns whose antecedent and consequent variations belong to separate clusters as well as patterns whose variations belong to the same cluster. A pattern is classified as important if its importance measure exceeds a predefined threshold. The important patterns are transformed into fuzzy rules in a fuzzy inference model to obtain more accurate salinity and temperature variation predictions. Simulation results verify the effectiveness of the proposed model.
