Publication


Featured research published by William Zhu.


Information Sciences | 2003

Reduction and axiomization of covering generalized rough sets

William Zhu; Fei-Yue Wang

This paper investigates some basic properties of covering generalized rough sets and compares them with the corresponding properties of Pawlak's rough sets, a tool for data mining. The focus here is on the concepts and conditions for two coverings to generate the same covering lower approximation or the same covering upper approximation. The concept of reducts of coverings is introduced and the procedure to find a reduct for a covering is given. It has been proved that the reduct of a covering is the minimal covering that generates the same covering lower approximation or the same covering upper approximation, so this concept is also a technique to get rid of redundancy in data mining. Furthermore, it has been shown that covering lower and upper approximations determine each other. Finally, a set of axioms is constructed to characterize the covering lower approximation operation.
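
The abstract does not spell out its definitions, so the following is only a minimal sketch under assumptions commonly used in the covering rough set literature: the lower approximation of a set is taken to be the union of all covering blocks contained in it, and a block is treated as reducible when it equals the union of other blocks. The function names and the toy covering are hypothetical.

```python
from itertools import chain


def lower_approximation(covering, target):
    """Union of all blocks of the covering that lie entirely inside target."""
    target = frozenset(target)
    inside = [block for block in covering if block <= target]
    return frozenset(chain.from_iterable(inside))


def is_reducible(block, covering):
    """Assumed definition: a block is reducible if it equals the union of
    other blocks of the covering (these are necessarily proper subsets)."""
    parts = [other for other in covering if other < block]
    return bool(parts) and frozenset(chain.from_iterable(parts)) == block


def reduct(covering):
    """Drop every reducible block; what remains is the assumed reduct."""
    covering = set(covering)
    return {block for block in covering if not is_reducible(block, covering)}


# Hypothetical covering of the universe {1, 2, 3, 4}.
C = {frozenset({1, 2}), frozenset({2, 3}), frozenset({1, 2, 3}), frozenset({4})}
R = reduct(C)  # {1, 2, 3} is the union of {1, 2} and {2, 3}, so it is dropped
same_lower = lower_approximation(C, {1, 2, 4}) == lower_approximation(R, {1, 2, 4})
```

On this toy input the covering and its reduct give the same lower approximation, which is the behaviour the abstract attributes to reducts.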


Information Sciences | 2007

Topological approaches to covering rough sets

William Zhu

Rough sets, a tool for data mining, deal with the vagueness and granularity in information systems. This paper studies covering-based rough sets from the topological view. We explore the topological properties of this type of rough sets, study the interdependency between the lower and the upper approximation operations, and establish the conditions under which two coverings generate the same lower approximation operation and the same upper approximation operation. Lastly, axiomatic systems for the lower approximation operation and the upper approximation operation are constructed.


Information Sciences | 2007

Generalized rough sets based on relations

William Zhu

Rough set theory was proposed by Pawlak as a tool for dealing with the vagueness and granularity in information systems. The core concepts of classical rough sets are lower and upper approximations based on equivalence relations. This paper studies generalized rough sets based on arbitrary binary relations. In this setting, a binary relation can generate a lower approximation operation and an upper approximation operation, but some of the common properties of classical lower and upper approximation operations are no longer satisfied. We investigate conditions for a relation under which these properties hold for the relation-based lower and upper approximation operations.
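
As a rough illustration of how a property can fail, the sketch below assumes the usual successor-neighborhood definitions: an element belongs to the lower approximation of X when all of its successors lie in X, and to the upper approximation when at least one successor does. These definitions and the toy relations are assumptions, not quoted from the paper.

```python
def successors(relation, x):
    """All y such that (x, y) is in the relation."""
    return {b for (a, b) in relation if a == x}


def lower(universe, relation, target):
    """Elements whose successors all lie inside target."""
    target = set(target)
    return {x for x in universe if successors(relation, x) <= target}


def upper(universe, relation, target):
    """Elements with at least one successor inside target."""
    target = set(target)
    return {x for x in universe if successors(relation, x) & target}


U = {1, 2, 3}
X = {2}

# Not reflexive: element 3 has no successors, so it falls into every
# lower approximation and lower(X) is no longer contained in X.
R = {(1, 2), (2, 2)}
fails = not lower(U, R, X) <= X

# Adding the missing pairs (1, 1) and (3, 3) makes the relation reflexive
# and restores lower(X) <= X <= upper(X) on this example.
R_reflexive = R | {(x, x) for x in U}
holds = lower(U, R_reflexive, X) <= X <= upper(U, R_reflexive, X)
```

Reflexivity is the standard condition for the containment shown here; other classical properties correspond to other conditions on the relation.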


Information Sciences | 2009

Relationship between generalized rough sets based on binary relation and covering

William Zhu

Rough set theory is a powerful tool for dealing with uncertainty, granularity, and incompleteness of knowledge in information systems. This paper systematically studies a type of generalized rough sets based on covering and the relationship between this type of covering-based rough sets and the generalized rough sets based on binary relation. Firstly, we present basic concepts and properties of this kind of rough sets. Then we investigate the relationships between this type of generalized rough sets and five other types of covering-based rough sets. The major contribution of this paper is that we establish the equivalency between this type of covering-based rough sets and a type of binary relation based rough sets. Through existing results in binary relation based rough sets, we present axiomatic systems for this type of covering-based lower and upper approximation operations. In addition, we explore the relationships among several important concepts such as minimal description, reduction, representative covering, exact covering, and unary covering in covering-based rough sets. Investigation of this type of covering-based rough sets will benefit our understanding of other types of rough sets based on covering and binary relation.


Information Sciences | 2011

Test-cost-sensitive attribute reduction

Fan Min; Huaping He; Yuhua Qian; William Zhu

In many data mining and machine learning applications, there are two objectives in the task of classification; one is decreasing the test cost, the other is improving the classification accuracy. Most existing research work focuses on the latter, with attribute reduction serving as an optional pre-processing stage to remove redundant attributes. In this paper, we point out that when tests must be undertaken in parallel, attribute reduction is mandatory in dealing with the former objective. With this in mind, we posit the minimal test cost reduct problem which constitutes a new, but more general, difficulty than the classical reduct problem. We also define three metrics to evaluate the performance of reduction algorithms from a statistical viewpoint. A framework for a heuristic algorithm is proposed to deal with the new problem; specifically, an information gain-based λ-weighted reduction algorithm is designed, where weights are decided by test costs and a non-positive exponent λ, which is the only parameter set by the user. The algorithm is tested with three representative test cost distributions on four UCI (University of California - Irvine) datasets. Experimental results show that there is a trade-off while setting λ, and a competition approach can improve the quality of the result significantly. This study suggests potential application areas and new research trends concerning attribute reduction.
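
The abstract only says that the weights depend on test costs and a non-positive exponent. One plausible reading, assumed here and not confirmed by the abstract, is to multiply an attribute's information gain by its test cost raised to that exponent. The sketch below shows how such a weight shifts the choice between two attributes; the function names, gains, and costs are all hypothetical.

```python
def weighted_score(gain, cost, lam):
    """Assumed weighting: information gain scaled by cost ** lam, lam <= 0."""
    if lam > 0:
        raise ValueError("the exponent is expected to be non-positive")
    if cost <= 0:
        raise ValueError("test costs are assumed to be positive")
    return gain * cost ** lam


def pick_attribute(gains, costs, lam):
    """Return the attribute with the highest weighted score."""
    return max(gains, key=lambda a: weighted_score(gains[a], costs[a], lam))


# Hypothetical information gains and test costs for two attributes.
gains = {"a": 0.5, "b": 0.4}
costs = {"a": 10, "b": 2}

ignore_cost = pick_attribute(gains, costs, 0)    # costs play no role
favour_cheap = pick_attribute(gains, costs, -1)  # cheap attribute wins
```

With an exponent of 0 the costs are ignored and the more informative attribute is chosen; a more negative exponent penalizes expensive tests more heavily, which is one way to picture the trade-off mentioned in the abstract.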


Information Sciences | 2009

Relationship among basic concepts in covering-based rough sets

William Zhu

Uncertainty and incompleteness of knowledge are widespread phenomena in information systems. Rough set theory is a tool for dealing with granularity and vagueness in data analysis. The rough set method has already been applied to various fields such as process control, economics, medical diagnosis, biochemistry, environmental science, biology, chemistry, psychology, and conflict analysis. Covering-based rough set theory is an extension of classical rough sets. In covering-based rough sets, there exist several basic concepts such as reducible elements of a covering, minimal descriptions, unary coverings, and the property that the intersection of any two elements is the union of finite elements in this covering. These concepts appeared separately in the literature on covering-based rough sets. In this paper we study the relationships between them. In particular, we establish the equivalence of the unary covering and the covering with the property that the intersection of any two elements is the union of finite elements in this covering. We also investigate the relationship between the covering lower approximation operation and the interior operator. A characterization of the interior operator by the covering lower approximation operation is presented in this paper. Correspondingly, we study the relationship between the covering upper approximation operation and the closure operator. In addition, we explore the conditions under which the covering upper approximation operation is monotone. The study of the relationships between these concepts will help us have a better understanding of covering-based rough sets.
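
The equivalence mentioned above can be checked on small finite examples. The sketch below assumes the usual definitions: the minimal description of an element is the set of minimal blocks containing it, and a covering is unary when every minimal description holds exactly one block. The function names and the two toy coverings are hypothetical.

```python
def minimal_description(x, covering):
    """Blocks containing x that have no proper sub-block also containing x."""
    containing = [k for k in covering if x in k]
    return {k for k in containing if not any(s < k for s in containing)}


def is_unary(universe, covering):
    """Assumed definition: every element has exactly one minimal block."""
    return all(len(minimal_description(x, covering)) == 1 for x in universe)


def intersections_are_unions(covering):
    """Check that each pairwise intersection is a union of covering blocks."""
    for k1 in covering:
        for k2 in covering:
            common = k1 & k2
            # Largest union of blocks that fits inside the intersection.
            covered = frozenset().union(*[k for k in covering if k <= common])
            if covered != common:
                return False
    return True


U = {1, 2, 3}
C1 = {frozenset({1, 2}), frozenset({2, 3})}
C2 = C1 | {frozenset({2})}

agree_on_c1 = is_unary(U, C1) == intersections_are_unions(C1)
agree_on_c2 = is_unary(U, C2) == intersections_are_unions(C2)
```

In the first covering, element 2 has two minimal blocks and the intersection {2} is not a union of blocks, so both checks fail together; adding the block {2} makes both succeed.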


Information Sciences | 2008

The algebraic structures of generalized rough set theory

Guilong Liu; William Zhu

Rough set theory is an important technique for knowledge discovery in databases, and its algebraic structure is part of the foundation of rough set theory. In this paper, we present the structures of the lower and upper approximations based on arbitrary binary relations. Some existing results concerning the interpretation of belief functions in rough set backgrounds are also extended. Based on the concepts of definable sets in rough set theory, two important Boolean subalgebras in the generalized rough sets are investigated. An algorithm to compute atoms for these two Boolean algebras is presented.


Information Sciences | 2012

Attribute reduction of data with error ranges and test costs

Fan Min; William Zhu

In data mining applications, we have a number of measurement methods to obtain a data item with different test costs and different error ranges. Test costs refer to time, money, or other resources spent in obtaining data items related to some object; observational errors correspond to differences between the measured and true values of a data item. In supervised learning, we need to decide which data items to obtain and which measurement methods to employ, so as to minimize the total test cost and help in constructing classifiers. This paper studies this problem in four steps. First, data models are built to address error ranges and test costs. Second, an error-range-based covering rough set is constructed to define lower and upper approximations, positive regions, and relative reducts. A closely related theory deals with the neighborhood rough set, which has been successfully applied to heterogeneous attribute reduction. The major difference between the two theories is the definition of neighborhood. Third, the minimal test cost attribute reduction problem is redefined in the new theory. Fourth, both backtrack and heuristic algorithms are proposed to deal with the new problem. The algorithms are tested on ten UCI (University of California - Irvine) datasets. Experimental results show that the backtrack algorithm is efficient on rational-sized datasets, the weighting mechanism for the heuristic information is effective, and the competition approach can improve the quality of the result significantly. This study suggests new research trends concerning attribute reduction and covering rough sets.


Information Systems | 2006

A New Type of Covering Rough Set

William Zhu; Fei-Yue Wang

Rough sets, a tool for data mining, deal with the vagueness and granularity in information systems. This paper studies a type of covering generalized rough sets. After presenting their basic properties, this paper explores the interdependency between the lower and the upper approximation operations, conditions under which two coverings generate the same upper approximation operation, and the axiomatic systems for these operations. In the end, this paper establishes the relationships between this type of covering rough sets and the other covering rough sets in the literature.


International Journal of Approximate Reasoning | 2014

Feature selection with test cost constraint

Fan Min; Qinghua Hu; William Zhu

Feature selection is an important preprocessing step in machine learning and data mining. In real-world applications, costs, including money, time and other resources, are required to acquire the features. In some cases, there is a test cost constraint due to limited resources. We shall deliberately select an informative and cheap feature subset for classification. This paper proposes the feature selection with test cost constraint problem for this issue. The new problem has a simple form when described as a constraint satisfaction problem (CSP). Backtracking is a general algorithm for CSP, and it is efficient in solving the new problem on medium-sized data. As the backtracking algorithm is not scalable to large datasets, a heuristic algorithm is also developed. Experimental results show that the heuristic algorithm can find the optimal solution in most cases. We also redefine some existing feature selection problems in rough sets, especially in decision-theoretic rough sets, from the viewpoint of CSP. These new definitions provide insight into some new research directions.
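
To make the backtracking idea concrete, here is a minimal sketch rather than the paper's algorithm: it searches feature subsets whose total test cost stays within a budget and keeps the one with the best score. The `quality` callback, the feature names, the costs, and the scores are hypothetical stand-ins for whatever measure of informativeness is actually used.

```python
def best_subset(costs, budget, quality):
    """Backtracking search for the highest-quality subset within the budget."""
    features = sorted(costs)
    best_score = quality(frozenset())
    best_set = frozenset()

    def backtrack(start, chosen, spent):
        nonlocal best_score, best_set
        score = quality(frozenset(chosen))
        if score > best_score:
            best_score, best_set = score, frozenset(chosen)
        for j in range(start, len(features)):
            feature = features[j]
            # Prune: never extend a subset beyond the test cost constraint.
            if spent + costs[feature] <= budget:
                chosen.append(feature)
                backtrack(j + 1, chosen, spent + costs[feature])
                chosen.pop()

    backtrack(0, [], 0)
    return best_set


# Hypothetical test costs and an additive toy quality measure.
costs = {"a": 3, "b": 2, "c": 2}
info = {"a": 5, "b": 3, "c": 3}
selected = best_subset(costs, budget=4, quality=lambda s: sum(info[f] for f in s))
```

Here the single most informative feature costs 3, but the two cheaper features together fit the budget of 4 and score higher, so the search returns that pair. The worst case is exponential in the number of features, which is consistent with the abstract's remark that backtracking does not scale to large datasets.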

Collaboration


Dive into William Zhu's collaborations.

Top Co-Authors

Fan Min

Zhangzhou Normal University

Hong Zhao

Zhangzhou Normal University

Fei-Yue Wang

Chinese Academy of Sciences

Qingxin Zhu

University of Electronic Science and Technology of China

Yanfang Liu

Zhangzhou Normal University

Aiping Huang

Zhangzhou Normal University

Kun She

University of Electronic Science and Technology of China
