Network


Latest external collaborations at the country level. Dive into the details by clicking on the dots.

Hotspot


Dive into the research topics where Chihli Hung is active.

Publication


Featured research published by Chihli Hung.


IEEE Intelligent Systems | 2004

Hybrid neural document clustering using guided self-organization and WordNet

Chihli Hung; Stefan Wermter; Peter Smith

Document clustering is a text-processing task that groups documents with similar concepts. It is usually considered an unsupervised learning approach because there is no teacher to guide the training process, and topical information is often assumed to be unavailable. A guided approach to document clustering that integrates linguistic top-down knowledge from WordNet into text vector representations based on the extended significance vector weighting technique improves both classification accuracy and average quantization error. In our guided self-organization approach we integrate topical and semantic information from WordNet. Because a document-training set with preclassified information implies relationships between a word and its preference class, we propose a novel document vector representation approach to extract these relationships for document clustering. Furthermore, by merging statistical methods, competitive neural models, and semantic relationships from the symbolic WordNet, our hybrid learning approach is robust and scales up to a real-world task of clustering 100,000 news documents.
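The core idea behind significance-vector weighting can be sketched roughly as follows: each word's class-conditional frequency profile is accumulated into a document vector. The exact extended weighting scheme is defined in the paper; the functions and toy corpus below are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter, defaultdict

def significance_vectors(docs, labels, classes):
    """For each word, estimate how strongly it indicates each class,
    as the fraction of its occurrences that fall in that class."""
    word_class = defaultdict(Counter)  # word -> {class: count}
    for doc, label in zip(docs, labels):
        for w in doc.split():
            word_class[w][label] += 1
    sig = {}
    for w, counts in word_class.items():
        total = sum(counts.values())
        sig[w] = [counts[c] / total for c in classes]
    return sig

def doc_vector(doc, sig, n_classes):
    """Document vector = average of the significance vectors of its words."""
    vec = [0.0] * n_classes
    words = [w for w in doc.split() if w in sig]
    for w in words:
        for i, v in enumerate(sig[w]):
            vec[i] += v
    if words:
        vec = [v / len(words) for v in vec]
    return vec
```

The resulting low-dimensional vectors embed the preclassified (top-down) information and can then be fed to a self-organizing map for clustering.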


international conference on data mining | 2003

A dynamic adaptive self-organising hybrid model for text clustering

Chihli Hung; Stefan Wermter

Clustering by document concepts is a powerful way of retrieving information from a large number of documents. This task in general does not make any assumption about the data distribution. For this task we propose a new competitive self-organising map (SOM) model, namely the dynamic adaptive self-organising hybrid model (DASH). The features of DASH are a dynamic structure, hierarchical clustering, nonstationary data learning and parameter self-adjustment. All features are data-oriented: DASH adjusts its behaviour not only by modifying its parameters but also through an adaptive structure. The hierarchical growing architecture is a useful facility for such a competitive neural model designed for text clustering. We present a new type of self-organising dynamic growing neural network which can deal with nonuniform data distributions and nonstationary data sets and represent the inner data structure in a hierarchical view.
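DASH itself grows and prunes its map dynamically; the flat competitive-learning update it builds on can be sketched as below. This is a minimal SOM-style sketch under assumed parameters (node count, learning rate), not the authors' model, and it omits DASH's hierarchical growth and self-adjustment.

```python
import random

def train_competitive(data, n_nodes=4, epochs=50, lr=0.3, seed=0):
    """Minimal competitive learning: for each input, find the best-matching
    unit (closest node) and move its weight vector toward the input."""
    rng = random.Random(seed)
    dim = len(data[0])
    nodes = [[rng.random() for _ in range(dim)] for _ in range(n_nodes)]
    for _ in range(epochs):
        for x in data:
            # best-matching unit = node with smallest squared distance
            bmu = min(range(n_nodes),
                      key=lambda i: sum((nodes[i][d] - x[d]) ** 2
                                        for d in range(dim)))
            for d in range(dim):
                nodes[bmu][d] += lr * (x[d] - nodes[bmu][d])
    return nodes
```

After training, winning nodes settle near the centres of the input clusters; a dynamic model such as DASH additionally decides when to add, split, or remove nodes as the data distribution drifts.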


hybrid intelligent systems | 2004

Neural Network Based Document Clustering Using WordNet Ontologies

Chihli Hung; Stefan Wermter

Three novel text vector representation approaches for neural network based document clustering are proposed. The first is the extended significance vector model (ESVM), the second is the hypernym significance vector model (HSVM) and the last is the hybrid vector space model (HyM). ESVM extracts the relationship between words and their preferred classified labels. HSVM exploits a semantic relationship from the WordNet ontology. A more general term, the hypernym, substitutes for terms with similar concepts. This hypernym semantic relationship supplements the neural model in document clustering. HyM is a combination of a TFxIDF vector and a hypernym significance vector, which combines the advantages and reduces the disadvantages from both unsupervised and supervised vector representation approaches. According to our experiments, the self-organising map (SOM) model based on the HyM text vector representation approach is able to improve classification accuracy and to reduce the average quantization error (AQE) on 10,000 full-text articles.
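The hypernym substitution step can be sketched with a toy lookup table standing in for WordNet's hypernym relation (the real model queries the WordNet ontology; the dictionary below is an illustrative assumption):

```python
# Toy stand-in for WordNet's hypernym relation (illustrative assumption).
HYPERNYMS = {"dog": "animal", "cat": "animal", "apple": "fruit", "pear": "fruit"}

def substitute_hypernyms(tokens, hypernyms=HYPERNYMS):
    """Replace each term with its more general hypernym when one is known,
    so that documents about 'dog' and 'cat' share the dimension 'animal'."""
    return [hypernyms.get(t, t) for t in tokens]
```

Substituting hypernyms before vectorization merges near-synonymous terms into shared, more general dimensions, which is what lets the hypernym significance vector supplement the purely statistical representation.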


international conference on computational linguistics | 2002

Self-organizing classification on the Reuters news corpus

Stefan Wermter; Chihli Hung

In this paper we propose an integration of a self-organizing map and semantic networks from WordNet for a text classification task using the new Reuters news corpus. This neural model is based on significance vectors and benefits from the presentation of document clusters. The hypernym relation in WordNet supplements the neural model in classification. We also analyse the relationships between news headlines and their contents in the new Reuters corpus through a series of experiments. This hybrid approach of neural self-organization and symbolic hypernym relationships successfully achieves good classification rates on 100,000 full-text news articles. These results demonstrate that the approach can scale up to a large real-world task and shows considerable potential for text classification.


international symposium on neural networks | 2005

A constructive and hierarchical self-organizing model in a non-stationary environment

Chihli Hung; Stefan Wermter

Several related self-organizing neural models have been proposed to enhance the flexibility of self-organizing maps. In our studies, these models depend on the pre-definition of several thresholds that guide neural behaviour for specific data sets. However, it is not trivial to determine those thresholds in a non-stationary environment: even when a proper threshold has been determined, it may not remain suitable in the future. Therefore, in this paper we compare the dynamic adaptive self-organizing hybrid (DASH) model with the growing neural gas (GNG) model by introducing several different initial thresholds to test their feasibility. Our experiments show that the DASH model is more stable and practicable for document clustering in a non-stationary environment, since DASH adjusts its behaviour not only by modifying its parameters but also through an adaptive structure.


international symposium on neural networks | 2010

Semantic Subspace Learning with conditional significance vectors

Nandita Tripathi; Stefan Wermter; Chihli Hung; Michael P. Oakes

Subspace detection and processing is receiving more attention as a method to speed up search and reduce processing overload. Subspace learning algorithms try to detect low-dimensional subspaces in the data which minimize the intra-class separation while maximizing the inter-class separation. In this paper we present a novel technique that uses the maximum significance value to detect a semantic subspace. We further modify the document vector using conditional significance to represent the subspace. This enhances the distinction between classes within the subspace. We compare our method against TFIDF with PCA and show that it consistently outperforms the baseline by a large margin when tested with a wide variety of learning algorithms. Our results show that the combination of subspace detection and conditional significance vectors improves subspace learning.
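The two-stage idea, detect a subspace from the maximum significance value, then re-represent the document within it, can be sketched schematically. The exact conditional-significance formula is given in the paper; the version below (pick the subspace containing the most significant dimension, then renormalize over that subspace's classes) is an illustrative assumption.

```python
def detect_subspace(doc_vec, subspaces):
    """Pick the subspace whose class dimensions contain the maximum
    significance value of the document vector."""
    best_dim = max(range(len(doc_vec)), key=lambda i: doc_vec[i])
    for name, dims in subspaces.items():
        if best_dim in dims:
            return name, dims

def conditional_vector(doc_vec, dims):
    """Restrict the vector to the chosen subspace and renormalize,
    sharpening the distinction between classes inside it."""
    sub = [doc_vec[i] for i in dims]
    total = sum(sub)
    return [v / total for v in sub] if total else sub
```

Restricting and renormalizing in this way is what makes the within-subspace class distinctions more pronounced than in the full vector.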


International Conference on Innovative Techniques and Applications of Artificial Intelligence | 2004

A Self-Organising Hybrid Model for Dynamic Text Clustering

Chihli Hung; Stefan Wermter

A text clustering neural model is traditionally assumed to cluster static text information and represent its inner structure on a flat map. However, the quantity of text information is continuously growing and the relationships within it are usually complicated. Therefore, the information is not static, and a flat map may not be enough to describe the relationships of the input data. In this paper, for a real-world text clustering task, we propose a new competitive Self-Organising Map (SOM) model, namely the Dynamic Adaptive Self-Organising Hybrid model (DASH). The features of DASH are a dynamic structure, hierarchical clustering, non-stationary data learning and parameter self-adjustment. All features are data-oriented: DASH adjusts its behaviour not only by modifying its parameters but also through an adaptive structure. We test the performance of our model using the larger new Reuters news corpus based on the criteria of classification accuracy and mean quantization error.


international symposium on neural networks | 2004

A time-based self-organising model for document clustering

Chihli Hung; Stefan Wermter

Most current approaches to document clustering do not consider the non-stationary nature of real-world document collections. In this paper we propose a new self-organising model for a non-stationary environment, namely the dynamic adaptive self-organising hybrid (DASH) model. The DASH model runs continuously: new document sets arrive consecutively for training while older document sets are still in the training stage. Knowledge learned from the old data set is adjusted to reflect the new data set, and therefore the document clusters stay up-to-date. We test the performance of our model using the Reuters-RCV1 news corpus and obtain promising results based on the criteria of classification accuracy and average quantization error.


european conference on information retrieval | 2004

Predictive Top-Down Knowledge Improves Neural Exploratory Bottom-Up Clustering

Chihli Hung; Stefan Wermter; Peter Smith

In this paper, we explore the hypothesis that integrating symbolic top-down knowledge into text vector representations can improve neural exploratory bottom-up representations for text clustering. By extracting semantic rules from WordNet, terms with similar concepts are substituted with a more general term, the hypernym. This hypernym semantic relationship supplements the neural model in document clustering. The neural model is based on the extended significance vector representation approach, into which predictive top-down knowledge is embedded. When we examine our hypothesis with six competitive neural models, the results are consistent and demonstrate that our robust hybrid neural approach is able to improve classification accuracy and reduce the average quantization error on 100,000 full-text articles.


international symposium on neural networks | 2010

Integrated Time Series Forecasting approaches using moving average, grey prediction, support vector regression and bagging for NNGC

Chihli Hung; Xin-Yi Huang; Hao-Kai Lin; Yen-Hsu Hou

Time series prediction is an interesting and challenging task in the field of data mining. This paper focuses on the monthly time series in NNGC. Two main kinds of approaches deal with time series prediction: statistical approaches and computational intelligence approaches. We treat moving average and grey prediction from the statistical field as our benchmarks. We then combine these two approaches respectively with support vector regression (SVR) from the computational intelligence field. The hybrid SVR approaches outperform moving average and grey prediction based on the criteria of MAPE, SMAPE and RMSE. Finally, we integrate these hybrid SVR approaches with the bagging ensemble technique to achieve further improvements in performance.
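The moving-average benchmark and the SMAPE criterion mentioned above can be sketched minimally as follows; this uses the common textbook definitions and omits the grey prediction, SVR, and bagging stages entirely.

```python
def moving_average_forecast(series, window=3):
    """One-step-ahead forecast: the mean of the last `window` observations."""
    return sum(series[-window:]) / window

def smape(actual, forecast):
    """Symmetric mean absolute percentage error, in percent:
    mean of |f - a| / ((|a| + |f|) / 2) over all points."""
    return 100.0 * sum(abs(f - a) / ((abs(a) + abs(f)) / 2.0)
                       for a, f in zip(actual, forecast)) / len(actual)
```

In a hybrid setup like the one described, such a statistical forecast would serve as one input feature (or baseline) for the SVR stage, and bagging would then average several SVR models trained on resampled data.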

Collaboration


Dive into Chihli Hung's collaborations.

Top Co-Authors

Peter Smith, University of Sunderland
Michael P. Oakes, University of Wolverhampton
Hao-Kai Lin, Chung Yuan Christian University
Xin-Yi Huang, Chung Yuan Christian University
Yen-Hsu Hou, Chung Yuan Christian University