Publication


Featured research published by Patrick G. Clark.


International Journal of Approximate Reasoning | 2014

Generalized probabilistic approximations of incomplete data

Jerzy W. Grzymala-Busse; Patrick G. Clark; Martin Kuehnhausen

In this paper we discuss a generalization of the idea of probabilistic approximations. Probabilistic (or parameterized) approximations, studied mostly in variable precision rough set theory, were originally defined using equivalence relations. Recently, probabilistic approximations were defined for arbitrary binary relations. Such approximations have an immediate application to data mining from incomplete data because incomplete data sets are described by a characteristic relation, which is reflexive but not necessarily symmetric or transitive. In contrast, complete data sets are described by an indiscernibility relation, which is an equivalence relation. The main objective of this paper was to compare experimentally, for the first time, two generalizations of probabilistic approximations: global and local. Additionally, we explored the problem of how many distinct probabilistic approximations may be defined for a given data set.
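
To make the characteristic relation concrete, here is a minimal Python sketch. It assumes one common formulation from the rough set literature, which is not necessarily the exact one used in the paper: for a case x, the characteristic set K(x) is the intersection, over all attributes where x has a specified value v, of the block of cases whose value is v or a "do not care" condition. The symbols "?" (lost value) and "*" ("do not care"), the function name and the toy table are illustrative assumptions.

```python
# Assumed encodings of the two kinds of missing attribute values.
LOST, DO_NOT_CARE = "?", "*"

def characteristic_sets(table):
    """Map each case x to its characteristic set K(x).

    table: dict mapping a case id to a tuple of attribute values.
    """
    universe = set(table)
    n_attrs = len(next(iter(table.values())))
    result = {}
    for x, row in table.items():
        k = set(universe)
        for a in range(n_attrs):
            v = row[a]
            if v in (LOST, DO_NOT_CARE):
                continue  # a missing value puts no restriction on K(x)
            # Block of (a, v): cases with value v or "do not care" for a.
            # Cases with a lost value for a belong to no block of a.
            block = {y for y, r in table.items() if r[a] in (v, DO_NOT_CARE)}
            k &= block
        result[x] = k
    return result

# Hypothetical toy data set with two attributes.
table = {
    1: ("high", "yes"),
    2: ("?", "yes"),
    3: ("low", "*"),
    4: ("high", "no"),
}
K = characteristic_sets(table)
for x in sorted(K):
    print(x, sorted(K[x]))
# 1 [1]
# 2 [1, 2, 3]
# 3 [3]
# 4 [4]
```

In this toy example every case belongs to its own characteristic set, so the induced relation is reflexive, but case 1 belongs to K(2) while case 2 does not belong to K(1), so it is not symmetric.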


Granular Computing | 2011

Experiments on probabilistic approximations

Patrick G. Clark; Jerzy W. Grzymala-Busse

Recently, much attention has been paid to probabilistic (parameterized) approximations that are generalizations of ordinary lower and upper approximations known from rough set theory. The first objective of this paper is to compare the quality of such approximations with that of ordinary lower and upper approximations. The second objective is to show that the number of distinct probabilistic approximations is quite limited. In our experiments we used six real-life data sets. Inconsistent data sets are required for such experiments, so we decreased the level of consistency of all data sets used. Our main result is rather pessimistic: probabilistic approximations, different from ordinary lower or upper approximations, were better than ordinary approximations for only two out of these six data sets.
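
The claim that the number of distinct probabilistic approximations is limited can be illustrated with a short Python sketch. It assumes the usual definition for complete data: the approximation with parameter α is the union of all equivalence classes [x] with Pr(X | [x]) ≥ α, for 0 < α ≤ 1. The partition, the concept and the function names are made up for illustration and are not taken from the paper's experiments.

```python
from fractions import Fraction

def probabilistic_approximation(blocks, concept, alpha):
    """Union of blocks B with Pr(concept | B) = |B & concept| / |B| >= alpha."""
    result = set()
    for block in blocks:
        if Fraction(len(block & concept), len(block)) >= alpha:
            result |= block
    return result

def distinct_thresholds(blocks, concept):
    """Positive conditional probabilities; only these values of alpha matter."""
    probs = {Fraction(len(b & concept), len(b)) for b in blocks}
    return sorted(p for p in probs if p > 0)

# Hypothetical partition of U = {1, ..., 8} and a concept X.
blocks = [{1}, {2, 3}, {4, 5, 6}, {7, 8}]
concept = {1, 2, 4, 5}

print([str(t) for t in distinct_thresholds(blocks, concept)])
# ['1/2', '2/3', '1']
for alpha in distinct_thresholds(blocks, concept):
    print(alpha, sorted(probabilistic_approximation(blocks, concept, alpha)))
# 1/2 [1, 2, 3, 4, 5, 6]   (same as the upper approximation)
# 2/3 [1, 4, 5, 6]
# 1 [1]                    (same as the lower approximation)
```

Any α between two consecutive thresholds yields the same set as the larger threshold, so under this definition the number of distinct approximations cannot exceed the number of distinct positive conditional probabilities.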


Information Sciences | 2014

Mining incomplete data with singleton, subset and concept probabilistic approximations

Patrick G. Clark; Jerzy W. Grzymala-Busse; Wojciech Rzasa

Rough set theory provides a very useful idea of lower and upper approximations for inconsistent data. For incomplete data these approximations are not unique. In this paper we investigate properties of three well-known generalizations of approximations: singleton, subset and concept. These approximations were recently further generalized to include an additional parameter α, interpreted as a probability. In this paper we report novel properties of singleton, subset and concept probabilistic approximations. Additionally, we validated such approximations experimentally. Our main objective was to test which of the singleton, subset and concept probabilistic approximations are the most useful for data mining. Our conclusion is that, for a given incomplete data set, all three approaches should be applied and the best approach should be selected as a result of ten-fold cross validation. Finally, we conducted experiments on the complexity of rule sets and the total number of singleton, subset and concept approximations.


International Symposium on Methodologies for Intelligent Systems | 2012

Local probabilistic approximations for incomplete data

Patrick G. Clark; Jerzy W. Grzymala-Busse; Martin Kuehnhausen

In this paper we introduce a generalization of the local approximation called a local probabilistic approximation. Our novel idea is associated with a parameter (probability) α. If α = 1, the local probabilistic approximation becomes a local lower approximation; for small α, it becomes a local upper approximation. The main objective of this paper is to test whether proper local probabilistic approximations (different from local lower and upper approximations) are better than ordinary local lower and upper approximations. Our experimental results, based on ten-fold cross validation, show that the outcome depends on the data set: for some data sets proper local probabilistic approximations are better than local lower and upper approximations; for some data sets there is no difference; and for yet other data sets proper local probabilistic approximations are worse than local lower and upper approximations.


International Conference on Rough Sets and Current Trends in Computing | 2012

How Good Are Probabilistic Approximations for Rule Induction from Data with Missing Attribute Values

Patrick G. Clark; Jerzy W. Grzymala-Busse; Zdzislaw S. Hippe

The main objective of our research was to test whether the probabilistic approximations should be used in rule induction from incomplete data. Probabilistic approximations, well known for many years, are used in variable precision rough set models and similar approaches to uncertainty.


Granular Computing | 2014

Mining incomplete data with lost values and attribute-concept values

Patrick G. Clark; Jerzy W. Grzymala-Busse

This paper presents novel research on an experimental comparison of two interpretations of missing attribute values: lost values and attribute-concept values. Experiments were conducted on 176 data sets, with preprocessing using three kinds of probabilistic approximations (lower, middle and upper) and then the MLEM2 rule induction system. The performance was evaluated using the error rate computed by ten-fold cross validation. Our main objective was to check which of the two interpretations of missing attribute values is better in terms of the error rate. In our experiments, the better performance, in 10 out of 24 cases, is accomplished using lost values. In the remaining 14 cases the difference in performance is not statistically significant (5% significance level).


International Joint Conference on Rough Sets | 2017

Characteristic Sets and Generalized Maximal Consistent Blocks in Mining Incomplete Data

Patrick G. Clark; Cheng Gao; Jerzy W. Grzymala-Busse; Teresa Mroczek

Mining incomplete data using approximations based on characteristic sets is a well-established technique. It is applicable to incomplete data sets with a few interpretations of missing attribute values, e.g., lost values and “do not care” conditions. Typically, probabilistic approximations are used in the process. On the other hand, maximal consistent blocks were introduced for incomplete data sets with only “do not care” conditions, using only lower and upper approximations. In this paper we introduce an extension of the maximal consistent blocks to incomplete data sets with any interpretation of missing attribute values and with probabilistic approximations. Additionally, we present results of experiments on mining incomplete data using both characteristic sets and maximal consistent blocks, using lost values and “do not care” conditions. We show that there is a small difference in quality of rule sets induced either way. However, characteristic sets can be computed in polynomial time while computing maximal consistent blocks is associated with exponential time complexity.


Rough Sets and Knowledge Technology | 2013

Generalizations of Approximations

Patrick G. Clark; Jerzy W. Grzymala-Busse; Wojciech Rząsa

In this paper we consider a generalization of the indiscernibility relation, i.e., a relation R that is not necessarily reflexive, symmetric, or transitive. There exist 36 basic definitions of lower and upper approximations based on such a relation R. Additionally, there are six probabilistic approximations, generalizations of 12 corresponding lower and upper approximations. How to convert the remaining 24 lower and upper approximations to 12 respective probabilistic approximations is an open problem.


Rough Sets and Knowledge Technology | 2014

An Analysis of Probabilistic Approximations for Rule Induction from Incomplete Data Sets

Patrick G. Clark; Jerzy W. Grzymala-Busse; Zdzislaw S. Hippe

The main objective of our research was to test whether the probabilistic approximations should be used in rule induction from incomplete data. For our research we designed experiments using six standard data sets. Four of the data sets were incomplete to begin with and two of the data sets had missing attribute values that were randomly inserted. In the six data sets, we used two interpretations of missing attribute values: lost values and “do not care” conditions. In addition, we used three definitions of approximations: singleton, subset and concept. Among 36 combinations of a data set, type of missing attribute values and type of approximation, for five combinations the error rate (the result of ten-fold cross validation) was smaller than for ordinary (lower and upper) approximations; for four other combinations, the error rate was larger than for ordinary approximations. For the remaining 27 combinations, the difference between these error rates was not statistically significant.


International Conference on Rough Sets and Current Trends in Computing | 2014

A Comparison of Two Versions of the MLEM2 Rule Induction Algorithm Extended to Probabilistic Approximations

Patrick G. Clark; Jerzy W. Grzymala-Busse

A probabilistic approximation is a generalization of the standard idea of lower and upper approximations, defined for equivalence relations. Recently, probabilistic approximations were additionally generalized to an arbitrary binary relation so that probabilistic approximations may be applied to incomplete data. We discuss two ways to induce rules from incomplete data using probabilistic approximations: by applying the true MLEM2 algorithm and by applying an emulated MLEM2 algorithm. In this paper we report novel research on a comparison of both approaches: new results of experiments on incomplete data with three interpretations of missing attribute values. Our results show that the two approaches do not differ much.

Collaboration


Top Co-Authors

Zdzislaw S. Hippe

Rzeszów University of Technology
