Edmond HaoCun Wu | Researchain

Publication

Featured researches published by Edmond HaoCun Wu.

Advanced Data Mining and Applications | 2005

Independent component analysis for clustering multivariate time series data

Edmond HaoCun Wu; Philip L. H. Yu

Independent Component Analysis (ICA) is a useful statistical method for separating mixed data sources into statistically independent patterns. In this paper, we apply ICA to transform multivariate time series data into independent components (ICs), and then propose a clustering algorithm called ICACLUS to group the underlying data series according to the ICs found. This clustering algorithm can be used to identify stocks with similar price movements. Experiments show that the method is effective and efficient, and that it outperforms other comparable clustering methods, such as K-means.
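
For reference, the standard linear ICA model that this kind of approach builds on can be written as below. The symbols $x_i(t)$, $a_{ij}$, $s_j(t)$, $n$, and $m$ are notation chosen here for illustration and are not taken from the abstract.

```latex
x_i(t) = \sum_{j=1}^{m} a_{ij}\, s_j(t), \qquad i = 1, \dots, n,
\qquad \text{or, in matrix form,} \qquad X = A S .
```

Here $x_i(t)$ is the $i$-th observed series, $s_j(t)$ is the $j$-th independent component, and $A = (a_{ij})$ is the mixing matrix. One natural way to compare two series is through their loading vectors $(a_{i1}, \dots, a_{im})$, although the abstract does not state whether ICACLUS groups series in exactly this way.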

International Journal of Neural Systems | 2006

Value at Risk Estimation Using Independent Component Analysis-Generalized Autoregressive Conditional Heteroscedasticity (ICA-GARCH) Models

Edmond HaoCun Wu; Philip L. H. Yu; Wai Keung Li

We suggest using independent component analysis (ICA) to decompose multivariate time series into statistically independent time series. We then propose ICA-GARCH models, which are computationally efficient, for estimating the multivariate volatilities. The experimental results show that the ICA-GARCH models are more effective than existing methods, including DCC, PCA-GARCH, and EWMA. We also apply the proposed models to compute value at risk (VaR) for risk management applications. Backtesting and out-of-sample tests validate the performance of the ICA-GARCH models for value at risk estimation.
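
The step from a volatility forecast to a VaR figure and a backtest can be sketched as follows. This is a minimal illustration that assumes normally distributed returns. The function names and the toy numbers are hypothetical, and the volatility forecasts are simply given rather than produced by an ICA-GARCH model.

```python
from statistics import NormalDist


def parametric_var(sigma, alpha=0.01, mean=0.0):
    """One-step VaR as a positive loss, assuming normally distributed returns."""
    z = NormalDist().inv_cdf(alpha)  # negative for alpha < 0.5
    return -(mean + z * sigma)


def count_violations(returns, var_forecasts):
    """Backtest: count periods in which the realised loss exceeded the VaR."""
    return sum(1 for r, v in zip(returns, var_forecasts) if -r > v)


# Toy inputs, made up for illustration only.
sigmas = [0.010, 0.012, 0.020, 0.015]     # volatility forecasts
returns = [0.004, -0.035, -0.010, 0.001]  # realised returns
var_99 = [parametric_var(s, alpha=0.01) for s in sigmas]
violations = count_violations(returns, var_99)
```

In a backtest over many periods, the share of violations is compared with the nominal level `alpha`; a share far above `alpha` suggests that the volatility model understates risk.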

Intelligent Data Engineering and Automated Learning | 2005

Volatility modelling of multivariate financial time series by using ICA-GARCH models

Edmond HaoCun Wu; Philip L. H. Yu

Volatility modelling of asset returns is an important aspect of many financial applications, e.g., option pricing and risk management. GARCH models are usually used to model the volatility processes of financial time series. However, multivariate GARCH modelling of volatilities is still a challenge due to the complexity of parameter estimation. To solve this problem, we suggest using Independent Component Analysis (ICA) to transform the multivariate time series into statistically independent time series. We then propose the ICA-GARCH model, which is computationally efficient, for estimating the volatilities. The experimental results show that this method models multivariate time series more effectively than existing methods, e.g., PCA-GARCH.
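
The idea of fitting univariate models to independent components and then mapping back to the original series can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: it uses a plain GARCH(1,1) recursion with hypothetical, pre-chosen parameters (in practice they would be estimated), and it assumes a known mixing matrix `A` so that the covariance is `A diag(h_t) A^T`.

```python
def garch11_variance(series, omega, alpha, beta):
    """Conditional variances h_t = omega + alpha * s_{t-1}^2 + beta * h_{t-1}.

    Requires alpha + beta < 1 so the unconditional variance exists.
    """
    h = [omega / (1.0 - alpha - beta)]  # start at the unconditional variance
    for s in series[:-1]:
        h.append(omega + alpha * s * s + beta * h[-1])
    return h


def covariance_from_components(A, h_t):
    """H_t = A diag(h_t) A^T for mixing matrix A and component variances h_t."""
    n, k = len(A), len(h_t)
    return [[sum(A[i][c] * h_t[c] * A[j][c] for c in range(k)) for j in range(n)]
            for i in range(n)]


# Hypothetical mixing matrix, component series, and GARCH parameters.
A = [[1.0, 0.5], [0.2, 1.0]]
components = [[0.1, -0.3, 0.2], [0.05, 0.4, -0.1]]
params = [(0.02, 0.10, 0.85), (0.05, 0.15, 0.70)]  # (omega, alpha, beta)

h = [garch11_variance(s, *p) for s, p in zip(components, params)]
H_last = covariance_from_components(A, [h_c[-1] for h_c in h])
```

Because each component gets its own univariate model, the number of GARCH parameters grows linearly with the number of components, which is one reason such a decomposition is cheaper than a full multivariate GARCH fit.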

Journal of Travel & Tourism Marketing | 2010

Data Mining For Hotel Occupancy Rate: An Independent Component Analysis Approach

Edmond HaoCun Wu; Rob Law; Brianda Jiang

The recent global financial crisis and the threat of a worldwide H1N1 influenza epidemic have greatly affected the tourism and hospitality industries around the world. Both hospitality practitioners and researchers are interested in finding analytical methods that enable forecasts to be made of hotel room demand under the uncertain conditions likely to affect the industry. In this article, a novel data mining technique called independent component analysis (ICA) is proposed to establish the major factors determining the hotel occupancy rate in Hong Kong. Then, an extension of the model is suggested that incorporates these factors to decompose hotel occupancy rates and examine the effect of each factor on the hotel occupancy rate. Empirical findings show that outbreaks of infectious diseases, economic performance, and service price were the major determinants of the hotel occupancy rate in Hong Kong over the period studied.

Computational Statistics & Data Analysis | 2009

A smoothed bootstrap test for independence based on mutual information

Edmond HaoCun Wu; Philip L. H. Yu; Wai Keung Li

A test for independence of multivariate time series based on the mutual information measure is proposed. First of all, a test for independence between two variables based on i.i.d. (time-independent) data is constructed and is then extended to incorporate higher dimensions and strictly stationary time series data. The smoothed bootstrap method is used to estimate the null distribution of mutual information. The experimental results reveal that the proposed smoothed bootstrap test performs better than the existing tests and can achieve high powers even for moderate dependence structures. Finally, the proposed test is applied to assess the actual independence of components obtained from independent component analysis (ICA).
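
For the simplest case described above (two variables, i.i.d. data), the overall shape of such a test can be sketched as follows. This is a simplified illustration, not the authors' procedure: the binned mutual information estimator, the fixed `bandwidth`, the Gaussian kernel, and all function names are assumptions made here.

```python
import math
import random


def binned_mi(x, y, bins=4):
    """Plug-in mutual information (in nats) using equal-frequency bins."""
    n = len(x)

    def to_bins(v):
        order = sorted(range(n), key=lambda i: v[i])
        out = [0] * n
        for rank, i in enumerate(order):
            out[i] = rank * bins // n
        return out

    bx, by = to_bins(x), to_bins(y)
    joint, px, py = {}, [0] * bins, [0] * bins
    for a, b in zip(bx, by):
        joint[(a, b)] = joint.get((a, b), 0) + 1
        px[a] += 1
        py[b] += 1
    return sum(c / n * math.log(c * n / (px[a] * py[b]))
               for (a, b), c in joint.items())


def smoothed_bootstrap_pvalue(x, y, n_boot=99, bandwidth=0.2, seed=0):
    """p-value for H0: x and y are independent, via a smoothed bootstrap."""
    rng = random.Random(seed)
    n = len(x)
    observed = binned_mi(x, y)
    exceed = 0
    for _ in range(n_boot):
        # Resample x and y separately (which imposes independence),
        # then add Gaussian kernel noise to smooth the resample.
        xs = [rng.choice(x) + bandwidth * rng.gauss(0, 1) for _ in range(n)]
        ys = [rng.choice(y) + bandwidth * rng.gauss(0, 1) for _ in range(n)]
        if binned_mi(xs, ys) >= observed:
            exceed += 1
    return (1 + exceed) / (1 + n_boot)


# Toy data: y_dep depends strongly on x.
rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(200)]
y_dep = [xi + 0.3 * rng.gauss(0, 1) for xi in x]
p_dep = smoothed_bootstrap_pvalue(x, y_dep)
```

A small p-value indicates that the observed mutual information is larger than what independent resamples typically produce, so independence is rejected.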

International Conference on Independent Component Analysis and Signal Separation | 2006

An independent component ordering and selection procedure based on the MSE criterion

Edmond HaoCun Wu; Philip L. H. Yu; Wai Keung Li

Principal components (PCs) by construction have a natural ordering based on their cumulative proportion of variance explained. However, the ordering of the independent components (ICs) found by most ICA algorithms is arbitrary, which limits the use of ICA in pattern discovery and dimension reduction. To solve this problem, we propose an efficient IC ordering approach and prove that it is guaranteed to find the optimal ordering of ICs based on the MSE criterion. Furthermore, we employ cross validation to select the number of dominant ICs. Simulation experiments show that the proposed IC ordering and selection procedure is efficient and effective, and that it can be used to identify the dominant ICs as well as to reduce the number of ICs.
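
One way to picture an MSE-based ordering is sketched below. It is an assumption-laden illustration rather than the paper's procedure: it assumes the data satisfy `X = A S` with mutually uncorrelated, zero-mean ICs, in which case dropping an IC adds its own mean squared contribution to the reconstruction error, so sorting by that contribution gives the ordering. All names and the toy matrices are hypothetical.

```python
def ic_energy(A, S, j):
    """Mean squared contribution of IC j to the reconstruction of X = A S."""
    n, T = len(A), len(S[j])
    return sum((A[i][j] * S[j][t]) ** 2
               for i in range(n) for t in range(T)) / (n * T)


def order_ics(A, S):
    """Indices of ICs sorted from largest to smallest contribution."""
    return sorted(range(len(S)), key=lambda j: ic_energy(A, S, j), reverse=True)


def reconstruction_mse(A, S, keep):
    """MSE between X = A S and its reconstruction from the ICs in `keep`."""
    n, k, T = len(A), len(S), len(S[0])
    dropped = [j for j in range(k) if j not in keep]
    err = 0.0
    for i in range(n):
        for t in range(T):
            resid = sum(A[i][j] * S[j][t] for j in dropped)
            err += resid * resid
    return err / (n * T)


# Toy example: three mutually orthogonal, zero-mean components.
A = [[0.1, 2.0, 0.5],
     [0.2, 1.0, 0.4]]
S = [[1, -1, 1, -1],
     [1, 1, -1, -1],
     [1, -1, -1, 1]]

order = order_ics(A, S)
mse_top1 = reconstruction_mse(A, S, keep=order[:1])
mse_worst1 = reconstruction_mse(A, S, keep=order[-1:])
```

Keeping the first component in `order` leaves a much smaller reconstruction error than keeping the last one, which is the sense in which the leading ICs are "dominant" in this sketch.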

Database Systems for Advanced Applications | 2004

On Improving Website Connectivity by Using Web-Log Data Streams

Edmond HaoCun Wu; Michael K. Ng; Joshua Zhexue Huang

When people visit Websites, they want to reach the contents they are interested in efficiently, precisely, and without delay. However, due to the constant changes in site contents and user patterns, the access efficiency of Websites cannot be optimized, especially in peak hours. In this paper, we first address the problems of access efficiency in Websites during peak hours and then propose new measures to evaluate access efficiency. An efficient algorithm is introduced to detect user access patterns using Website topology and Web-log stream data. Using this method, we can modify a Website topology online so that the new topology improves the Website connectivity and adapts to current visitors’ access patterns. A real sports Website is used to evaluate how effectively our proposed method accelerates user access to related contents. The results of the evaluation suggest that this method can feasibly and intelligently improve the connectivity of a Website online.

Pacific-Asia Conference on Knowledge Discovery and Data Mining | 2004

An efficient algorithm for dense regions discovery from large-scale data streams

Andy M. Yip; Edmond HaoCun Wu; Michael K. Ng; Tony F. Chan

We introduce the notion of dense regions as distinct and meaningful patterns in given data. Efficient and effective algorithms for identifying such regions are presented. Next, we discuss extensions of the algorithms for handling data streams. Finally, experiments on large-scale data streams such as clickstreams are given, which confirm the usefulness of our algorithms.

Knowledge Discovery and Data Mining | 2003

A graph-based optimization algorithm for website topology using interesting association rules

Edmond HaoCun Wu; Michael K. Ng

The Web serves as a global information service center that contains a vast amount of data. The Website structure should be designed effectively so that users can efficiently find the information they need. The main contribution of this paper is to propose a graph-based optimization algorithm to modify Website topology using interesting association rules. The interestingness of an association rule A ⇒ B is defined based on the probability measure between two sets of Web pages A and B in the Website. If the probability measure between A and B is low (high), then the association rule A ⇒ B has high (low) interest. The hyperlinks in the Website can be modified to adapt to user access patterns according to association rules with high interest. We present experimental results and demonstrate that our method is effective.

IEEE/ACM Transactions on Computational Biology and Bioinformatics | 2007

Strategies for Identifying Statistically Significant Dense Regions in Microarray Data

Andy M. Yip; Michael K. Ng; Edmond HaoCun Wu; Tony F. Chan

We propose and study the notion of dense regions for the analysis of categorized gene expression data and present some searching algorithms for discovering them. The algorithms can be applied to any categorical data matrices derived from gene expression level matrices. We demonstrate that dense regions are simple but useful and statistically significant patterns that can be used to 1) identify genes and/or samples of interest and 2) eliminate genes and/or samples corresponding to outliers, noise, or abnormalities. Some theoretical studies on the properties of the dense regions are presented which allow us to characterize dense regions into several classes and to derive tailor-made algorithms for different classes of regions. Moreover, an empirical simulation study on the distribution of the size of dense regions is carried out which is then used to assess the significance of dense regions and to derive effective pruning methods to speed up the searching algorithms. Real microarray data sets are employed to test our methods. Comparisons with six other well-known clustering algorithms using synthetic and real data are also conducted which confirm the superiority of our methods in discovering dense regions. The DRIFT code and a tutorial are available as supplemental material, which can be found on the Computer Society Digital Library at http://computer.org/tcbb/archives.htm.
