Haofeng Zhou | Researchain

Publication

Featured research published by Haofeng Zhou.

Pacific-Asia Conference on Knowledge Discovery and Data Mining | 2004

Efficient Pattern-Growth Methods for Frequent Tree Pattern Mining

Chen Wang; Mingsheng Hong; Jian Pei; Haofeng Zhou; Wei Wang; Baile Shi

Mining frequent tree patterns is an important research problem with broad applications in bioinformatics, digital libraries, e-commerce, and so on. Previous studies strongly suggest that pattern-growth methods are efficient for frequent pattern mining. In this paper, we systematically develop pattern-growth methods for mining frequent tree patterns. Two algorithms, Chopper and XSpanner, are devised. An extensive performance study shows that the two newly developed algorithms outperform TreeMinerV [13], one of the fastest methods proposed before, in mining large databases. Furthermore, XSpanner is substantially faster than Chopper in many cases.

Data Warehousing and Knowledge Discovery | 2005

FMC: an approach for privacy preserving OLAP

Ming Hua; Shouzhi Zhang; Wei Wang; Haofeng Zhou; Baile Shi

Preserving private information while still supporting thorough analysis is a significant issue in OLAP systems. One challenge is to prevent sensitive values from being inferred from more aggregated, non-sensitive data. This paper presents a novel algorithm, FMC, which eliminates the inference problem by hiding additional data besides the sensitive information itself, and proves that hiding this additional data is both necessary and sufficient. The approach can therefore provide users with as much information as possible while preserving security. The strategy does not affect the online performance of the OLAP system. Systematic analysis and experimental comparison are provided to show the effectiveness and feasibility of FMC.
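The inference problem described in this abstract can be illustrated with a small sketch. This is not the FMC algorithm; it only shows, under assumed data, why hiding the sensitive cell alone is not enough when a row total is still published. The function name `infer_hidden` and the quarterly figures are hypothetical.

```python
from typing import Dict, Optional


def infer_hidden(row: Dict[str, Optional[float]], total: float) -> Optional[Dict[str, float]]:
    """Try to recover a hidden cell (marked None) from a published row total.

    Returns the recovered value when exactly one cell is hidden,
    otherwise None (the total alone does not determine the hidden cells).
    """
    hidden = [name for name, value in row.items() if value is None]
    if len(hidden) != 1:
        return None
    known = sum(value for value in row.values() if value is not None)
    return {hidden[0]: total - known}


# Only the sensitive cell Q2 is hidden: the published total gives it away.
leaky = infer_hidden({"Q1": 120.0, "Q2": None, "Q3": 90.0}, total=300.0)

# Hiding one additional cell leaves the total with two unknowns.
protected = infer_hidden({"Q1": None, "Q2": None, "Q3": 90.0}, total=300.0)
```

In the first call the attacker recovers Q2 exactly; in the second, the extra suppressed cell blocks exact inference from this total. A real cube has many overlapping aggregates, so deciding which additional cells to hide is the hard part the paper addresses.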

Computer and Information Technology | 2005

Incorporating with Recursive Model Training in Time Series Clustering

Jiangjiao Duan; Wei Wang; Bing Liu; Yongsheng Xue; Haofeng Zhou; Baile Shi

Model-based clustering is one of the most important approaches to time series data mining. However, the clustering process may encounter several problems. In this paper, a novel time series clustering algorithm that incorporates recursive hidden Markov model (HMM) training is proposed. Our contributions are as follows: 1) We recursively train models and use the resulting model information in the process of agglomerative hierarchical clustering. 2) We build HMMs of time series clusters to describe the clusters. To evaluate the effectiveness of the algorithm, several experiments are conducted on both synthetic and real-world data. The results show that the proposed approach achieves a higher correctness rate than the traditional HMM-based clustering algorithm.

Knowledge Discovery and Data Mining | 2003

An efficient algorithm of frequent connected subgraph extraction

Mingsheng Hong; Haofeng Zhou; Wei Wang; Baile Shi

Mining frequent patterns from datasets is one of the key success stories of data mining research. Currently, most work focuses on independent data, such as the items in a market basket. However, real-world objects are often closely related to each other. The objective of this paper is to extract frequent patterns from these relations. We use graphs to model the relations and select a simple type of graph for analysis. By combining graph theory with frequent pattern generation algorithms, we propose a new algorithm, Topology, which can mine these graphs efficiently. We evaluate the performance of the algorithm through experiments on synthetic datasets and real data. The experimental results show that Topology performs well. At the end of the paper, potential improvements are discussed.

Web Age Information Management | 2008

REC: A Novel Model to Rank Experts in Communities

Chen Lin; Haofeng Zhou; Zhenhua Huang; Wei Wang

Getting support from experts is an important issue in daily life, and expert finding is challenging. In previous commercial and academic systems, users may not get what they expect. In this contribution, we address the problem of finding experts in communities. A novel model, REC, is presented to solve the expert finding problem in a dynamic environment. The model ranks experts using textual and social information. Starting with the most familiar communities, the expert seeker can find appropriate experts by considering both their local rankings in each community and the difficulty of getting their help. Experiments are conducted on real data sets, including the DBLP data set and the W3C corpora. Compared with other existing methods, REC achieves promising results, demonstrating the model's competence in various search applications.

Web Age Information Management | 2006

An effective approach for hiding sensitive knowledge in data publishing

Zhihui Wang; Bing Liu; Wei Wang; Haofeng Zhou; Baile Shi

Recent efforts have been made to address the problem of privacy preservation in data publishing. However, they mainly focus on preserving data privacy. In this paper, we address another aspect of privacy preservation in data publishing, where some of the knowledge implied by a dataset is regarded as private or sensitive information. In particular, we consider data stored in a transaction database and knowledge represented in the form of patterns. We present a data sanitization algorithm, called SanDB, that effectively protects a set of sensitive patterns while attempting to minimize the impact of data sanitization on the non-sensitive patterns. The experimental results show that SanDB achieves a significant improvement over the best approach presented in the literature.
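As a rough illustration of what sanitizing a transaction database means, the following is a naive greedy sketch, not SanDB itself. It assumes that a pattern counts as hidden once its support count falls below a threshold `min_count`, and it uses a simple heuristic chosen for this example: remove the pattern's most common item from the shortest supporting transactions. All names here are hypothetical.

```python
from typing import FrozenSet, Iterable, List, Set


def support_count(db: Iterable[Set[str]], pattern: Iterable[str]) -> int:
    """Number of transactions that contain every item of the pattern."""
    items = set(pattern)
    return sum(1 for transaction in db if items <= transaction)


def sanitize(db: List[Set[str]], sensitive: List[FrozenSet[str]], min_count: int) -> List[Set[str]]:
    """Return a copy of db in which each sensitive pattern has support < min_count."""
    result = [set(t) for t in db]  # work on a copy; the input is left untouched
    for pattern in sensitive:
        excess = support_count(result, pattern) - (min_count - 1)
        if excess <= 0:
            continue  # already below the threshold
        # Assumed heuristic: edit the shortest supporting transactions first.
        supporters = sorted((t for t in result if pattern <= t), key=len)
        # Assumed heuristic: drop the pattern's most common item (ties: alphabetical).
        victim = max(sorted(pattern), key=lambda item: support_count(result, {item}))
        for transaction in supporters[:excess]:
            transaction.discard(victim)
    return result


db = [{"a", "b", "c"}, {"a", "b"}, {"a", "b", "d"}, {"b", "c"}]
clean = sanitize(db, [frozenset({"a", "b"})], min_count=2)
```

Note the side effect: removing `b` from two transactions also lowers the support of the non-sensitive pattern `{b, c}`. Limiting that kind of collateral damage is the goal the abstract describes.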

Web Age Information Management | 2000

Mining Association Rules with Negative Items Using Interest Measure

Haofeng Zhou; Pan Gao; Yangyong Zhu

In this paper, we analyze some potential problems in the existing algorithms for mining association rules. These problems arise from considering only a rule's support and confidence while neglecting the extent to which the rule will interest people. In addition, the existing definition and mining algorithms of association rules do not take negative items into account, so many valuable rules are lost. We therefore introduce the concepts of interest measure and negative item into the definition and evaluation system. We then modify the existing algorithms to use the interest measure to generate rules with negative items. At the end of this paper, we analyze the new algorithm and show it to be efficient and feasible.
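To make the idea of rules with negative items concrete, here is a minimal sketch that scores a rule whose sides may require items to be present or absent. The abstract does not give the paper's interest formula, so this example assumes the common lift ratio, support(A and C) / (support(A) * support(C)), as a stand-in. The function names and the toy transactions are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set

EMPTY: FrozenSet[str] = frozenset()


def support(db: List[Set[str]], present: FrozenSet[str] = EMPTY, absent: FrozenSet[str] = EMPTY) -> float:
    """Fraction of transactions containing all `present` items and none of the `absent` items."""
    if not db:
        return 0.0
    hits = sum(1 for t in db if present <= t and not (absent & t))
    return hits / len(db)


def rule_metrics(db: List[Set[str]],
                 ante_pos: FrozenSet[str] = EMPTY, ante_neg: FrozenSet[str] = EMPTY,
                 cons_pos: FrozenSet[str] = EMPTY, cons_neg: FrozenSet[str] = EMPTY) -> Dict[str, float]:
    """Support, confidence, and an assumed lift-style interest for antecedent => consequent."""
    s_ante = support(db, ante_pos, ante_neg)
    s_cons = support(db, cons_pos, cons_neg)
    s_both = support(db, ante_pos | cons_pos, ante_neg | cons_neg)
    confidence = s_both / s_ante if s_ante else 0.0
    interest = s_both / (s_ante * s_cons) if s_ante and s_cons else 0.0
    return {"support": s_both, "confidence": confidence, "interest": interest}


db = [{"tea", "milk"}, {"tea"}, {"coffee", "milk"}, {"coffee", "tea"}]
# Rule with a negative item: tea => NOT coffee
metrics = rule_metrics(db, ante_pos=frozenset({"tea"}), cons_neg=frozenset({"coffee"}))
```

Under this assumed measure, a value above 1 means the antecedent and consequent occur together more often than independence would predict, which is one way to filter out rules that pass support and confidence thresholds but are not informative.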

International Database Engineering and Applications Symposium | 2003

Refining Web authoritative resource by frequent structures

Haofeng Zhou; Yubo Lou; Qingqing Yuan; Wilfred Ng; Wei Wang; Baile Shi

The Web is a rich collection of dynamic information that is useful in various disciplines. There has been much research on improving the quality of information searching on the Web. However, most of this work is still inadequate to satisfy the diverse demands of users. In this paper, we exploit the hyperlinks in the Web and propose a new approach called SFP to improve the quality of search results obtained from search engines. The SFP algorithm evolves from frequent pattern mining, a common data mining technique for conventional databases. The essential idea of our approach is to mine the frequent structures of links from a given Web topology. Using the SFP algorithm, we extract the authoritative pages and communities from the complex Web topology. We demonstrate our approach through several experiments and show that the performance and functionality of SFP in managing search results are better than those of other known methods such as HITS.

Journal of Computer Science and Technology | 2003

PHC: a fast partition and hierarchy-based clustering algorithm

Haofeng Zhou; Qingqing Yuan; Zunping Cheng; Baile Shi

Cluster analysis is a process of classifying data in a specified data set. In this field, much attention is paid to high-efficiency clustering algorithms. In this paper, the features of the current partition-based and hierarchy-based algorithms are reviewed, and a new hierarchy-based algorithm, PHC, is proposed by combining the advantages of both; it uses cohesion and closeness to merge clusters. Compared with similar algorithms, PHC improves performance while guaranteeing clustering quality. Both properties are supported by the theoretical and experimental analyses in the paper.

Web Age Information Management | 2001

ARMiner: A Data Mining Tool Based on Association Rules

Haofeng Zhou; Beijun Ruan; Jianqiu Zhu; Yangyong Zhu; Baile Shi

In this paper, ARMiner, a data mining tool based on association rules, is introduced. Beginning with the system architecture, its characteristics and functions are described in detail, including data transfer, concept hierarchy generalization, mining rules with negative items, and the re-development of the system. We also show an example of the tool's application. Finally, some expectations for future work are presented.
