Pawan Lingras | Researchain


Publication

Featured research published by Pawan Lingras.

intelligent information systems | 2004

Interval Set Clustering of Web Users with Rough K-Means

Pawan Lingras; Chad West

Data collection and analysis in web mining face certain unique challenges. Due to a variety of reasons inherent in web browsing and web logging, the likelihood of bad or incomplete data is higher than in conventional applications. The analytical techniques in web mining need to accommodate such data. Fuzzy and rough sets provide the ability to deal with incomplete and approximate information. Fuzzy set theory has been shown to be useful in three important aspects of web and data mining, namely clustering, association, and sequential analysis. There is increasing interest in research on clustering based on rough set theory. Clustering is an important part of web mining that involves finding natural groupings of web resources or web users. Researchers have pointed out some important differences between clustering in conventional applications and clustering in web mining. For example, the clusters and associations in web mining do not necessarily have crisp boundaries. As a result, researchers have studied the possibility of using fuzzy sets in web mining clustering applications. Recent attempts have used genetic algorithms based on rough set theory for clustering. However, clustering based on genetic algorithms may not be able to handle the large amount of data typical in a web mining application. This paper proposes a variation of the K-means clustering algorithm based on properties of rough sets. The proposed algorithm represents clusters as interval or rough sets. The paper also describes the design of an experiment including data collection and the clustering process. The experiment is used to create interval set representations of clusters of web visitors.
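The abstract above does not spell out the algorithm, so the following is only a minimal sketch of one common way to formulate a rough K-means step, not the authors' implementation. It assumes that an object goes to the lower approximation of its nearest cluster unless another centroid is almost as close, in which case it goes only to the upper approximations of all such clusters. The names `threshold` and `w_lower`, the distance-difference test, and the weighted centroid update are all assumptions made for illustration.

```python
import math

def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def assign(points, centroids, threshold):
    """Build lower/upper approximations (as index sets) for each cluster."""
    k = len(centroids)
    lower = [set() for _ in range(k)]
    upper = [set() for _ in range(k)]
    for i, p in enumerate(points):
        d = [math.dist(p, c) for c in centroids]
        nearest = min(range(k), key=d.__getitem__)
        # Clusters whose centroid is within `threshold` of the nearest one.
        near = [j for j in range(k) if d[j] - d[nearest] <= threshold]
        for j in near:
            upper[j].add(i)
        if len(near) == 1:          # unambiguous: also a lower-bound member
            lower[nearest].add(i)
    return lower, upper

def update(points, lower, upper, centroids, w_lower):
    """Recompute centroids, weighting lower members above boundary members."""
    new = []
    for j, old in enumerate(centroids):
        boundary = upper[j] - lower[j]
        lo = [points[i] for i in sorted(lower[j])]
        bd = [points[i] for i in sorted(boundary)]
        if lo and bd:
            a, b = mean(lo), mean(bd)
            new.append(tuple(w_lower * x + (1 - w_lower) * y for x, y in zip(a, b)))
        elif lo:
            new.append(mean(lo))
        elif bd:
            new.append(mean(bd))
        else:
            new.append(old)         # empty cluster: keep the old centroid
    return new

def rough_k_means(points, centroids, threshold=1.0, w_lower=0.7, max_iter=50):
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        lower, upper = assign(points, centroids, threshold)
        new = update(points, lower, upper, centroids, w_lower)
        if new == centroids:
            break
        centroids = new
    lower, upper = assign(points, centroids, threshold)
    return lower, upper, centroids

# Toy one-dimensional data: the point at 5 is equally far from both groups.
points = [(0.0,), (1.0,), (5.0,), (9.0,), (10.0,)]
lower, upper, centroids = rough_k_means(points, [(0.0,), (10.0,)])
```

In this toy run the middle point ends up in both upper approximations and in neither lower approximation, which is the interval-set representation of a cluster with a non-crisp boundary.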

knowledge discovery and data mining | 1998

Data mining using extensions of the rough set model

Pawan Lingras; Yiyu Yao

This article examines basic issues of data mining using the theory of rough sets, which is a recent proposal for generalizing classical set theory. The Pawlak rough set model is based on the concept of an equivalence relation. Recent research has shown that a generalized rough set model need not be based on equivalence relation axioms. The Pawlak rough set model has been used for deriving deterministic as well as probabilistic rules from a complete database. This article demonstrates that a generalized rough set model can be used for generating rules from incomplete databases. These rules are based on plausibility functions proposed by Shafer. The article also discusses the importance of rule extraction from incomplete databases in data mining.
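As a rough illustration of the idea, lower and upper approximations can be computed from any neighborhood assignment, not only from equivalence classes. The sketch below assumes a hypothetical toy table in which `None` marks a missing value and two records count as similar when they agree wherever both have a value. That similarity relation is reflexive and symmetric but not transitive, so it is not an equivalence relation. The table, the `similar` function, and the target set are illustrative assumptions; the rule generation with plausibility functions described in the abstract is not shown.

```python
def approximations(universe, neighborhood, target):
    """Return (lower, upper) approximations of `target`.

    lower: objects whose whole neighborhood lies inside the target.
    upper: objects whose neighborhood overlaps the target.
    """
    lower = {x for x in universe if neighborhood[x] <= target}
    upper = {x for x in universe if neighborhood[x] & target}
    return lower, upper

def similar(a, b):
    """Records match if they agree on every attribute known in both."""
    return all(x is None or y is None or x == y for x, y in zip(a, b))

# Hypothetical incomplete table: record 2 has a missing second attribute.
records = {
    1: ("high", "yes"),
    2: ("high", None),
    3: ("low", "no"),
    4: ("low", "no"),
    5: ("high", "no"),
}
neighborhood = {
    i: {j for j in records if similar(records[i], records[j])}
    for i in records
}

target = {1, 2}
lower, upper = approximations(set(records), neighborhood, target)
```

Here record 2 is similar to both 1 and 5 while 1 and 5 are not similar to each other, so only record 1 is certainly in the target, and records 1, 2, and 5 are possibly in it.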

soft computing | 1998

Interpretations of belief functions in the theory of rough sets

Yiyu Yao; Pawan Lingras

This paper reviews and examines interpretations of belief functions in the theory of rough sets with finite universe. The concept of standard rough set algebras is generalized in two directions. One is based on the use of nonequivalence relations. The other is based on relations over two universes, which leads to the notion of interval algebras. Pawlak rough set algebras may be used to interpret belief functions whose focal elements form a partition of the universe. Generalized rough set algebras using nonequivalence relations may be used to interpret belief functions which have fewer than |U| focal elements, where |U| is the cardinality of the universe U on which belief functions are defined. Interval algebras may be used to interpret any belief functions.
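For the Pawlak case, the connection can be sketched with a small example: if each equivalence class is treated as a focal element with mass proportional to its size, then the belief of a set is the relative size of its lower approximation and the plausibility is the relative size of its upper approximation. The partition and target set below are hypothetical, and the sketch covers only the partition case, not the generalized or interval algebras.

```python
from fractions import Fraction

def belief_and_plausibility(partition, target):
    """Bel and Pl of `target` when the focal elements are the partition blocks.

    Each block E carries mass |E| / |U|, so summing the masses of blocks
    contained in (or overlapping) the target equals the relative size of
    the lower (or upper) approximation.
    """
    size = sum(len(block) for block in partition)
    bel = sum(Fraction(len(b), size) for b in partition if b <= target)
    pl = sum(Fraction(len(b), size) for b in partition if b & target)
    return bel, pl

# Hypothetical universe {1, ..., 6} split into three equivalence classes.
partition = [{1, 2}, {3}, {4, 5, 6}]
target = {1, 2, 3, 4}

bel, pl = belief_and_plausibility(partition, target)

# Duality check: Bel(X) = 1 - Pl(complement of X).
universe = set().union(*partition)
_, pl_complement = belief_and_plausibility(partition, universe - target)
```

The blocks {1, 2} and {3} lie inside the target, so its lower approximation has three of the six elements, while every block overlaps the target, so its upper approximation is the whole universe.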

Information Sciences | 2007

Rough set based 1-v-1 and 1-v-r approaches to support vector machine multi-classification

Pawan Lingras; Cory J. Butz

Support vector machines (SVMs) are essentially binary classifiers. To improve their applicability, several methods have been suggested for extending SVMs for multi-classification, including one-versus-one (1-v-1), one-versus-rest (1-v-r) and DAGSVM. In this paper, we first describe how binary classification with SVMs can be interpreted using rough sets. A rough set approach to SVM classification removes the necessity of exact classification and is especially useful when dealing with noisy data. Next, by utilizing the boundary region in rough sets, we suggest two new approaches, extensions of 1-v-r and 1-v-1, to SVM multi-classification that allow for an error rate. We explicitly demonstrate how our extended 1-v-r may shorten the training time of the conventional 1-v-r approach. In addition, we show that our 1-v-1 approach may have reduced storage requirements compared to the conventional 1-v-1 and DAGSVM techniques. Our techniques also provide better semantic interpretations of the classification process. The theoretical conclusions are supported by experimental findings involving a synthetic dataset.

IEEE Transactions on Knowledge and Data Engineering | 2009

Rough Cluster Quality Index Based on Decision Theory

Pawan Lingras; Min Chen; Duoqian Miao

Quality of clustering is an important issue in the application of clustering techniques. Most traditional cluster validity indices are geometry-based cluster quality measures. This paper proposes a cluster validity index based on the decision-theoretic rough set model by considering various loss functions. Experiments with synthetic, standard, and real-world retail data show the usefulness of the proposed validity index for the evaluation of rough and crisp clustering. The measure is shown to help determine the optimal number of clusters, as well as an important parameter called threshold in rough clustering. The experiments with a promotional campaign for the retail data illustrate the ability of the proposed measure to incorporate financial considerations in evaluating the quality of a clustering scheme. This ability to deal with monetary values distinguishes the proposed decision-theoretic measure from other distance-based measures. The proposed validity index can also be extended for evaluating other clustering algorithms such as fuzzy clustering.

intelligent information systems | 2001

Unsupervised Rough Set Classification Using GAs

Pawan Lingras

The rough set is a useful notion for the classification of objects when the available information is not adequate to represent classes using precise sets. Rough sets have been successfully used in information systems for learning rules from an expert. This paper describes how genetic algorithms can be used to develop rough sets. The proposed rough set theoretic genetic encoding will be especially useful in unsupervised learning. A rough set genome consists of upper and lower bounds for sets in a partition. The partition may be as simple as the conventional expert class and its complement or a more general classification scheme. The paper provides a complete description of design and implementation of rough set genomes. The proposed design and implementation is used to provide an unsupervised rough set classification of highway sections.

International Journal of Approximate Reasoning | 2013

Soft clustering -- Fuzzy and rough approaches and their extensions and derivatives

Georg Peters; Fernando Crespo; Pawan Lingras; Richard Weber

Clustering is one of the most widely used approaches in data mining with real life applications in virtually any domain. The huge interest in clustering has led to a possibly three-digit number of algorithms with the k-means family probably the most widely used group of methods. Besides classic bivalent approaches, clustering algorithms belonging to the domain of soft computing have been proposed and successfully applied in the past four decades. Bezdek's fuzzy c-means is a prominent example for such soft computing cluster algorithms with many effective real life applications. More recently, Lingras and West enriched this area by introducing rough k-means. In this article we compare k-means to fuzzy c-means and rough k-means as important representatives of soft clustering. On the basis of this comparison, we then survey important extensions and derivatives of these algorithms; our particular interest here is on hybrid clustering, merging fuzzy and rough concepts. We also give some examples where k-means, rough k-means, and fuzzy c-means have been used in studies.
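To make the contrast with hard k-means concrete, the sketch below computes membership degrees using the standard fuzzy c-means membership formula, under which a point belongs to every cluster to a degree that depends on its relative distances to all centroids. The function name, the fuzzifier default `m=2.0`, and the toy data are assumptions for illustration; the centroid update and the hybrid fuzzy-rough variants surveyed in the article are not shown.

```python
import math

def fcm_memberships(point, centroids, m=2.0):
    """Membership of `point` in each cluster under fuzzy c-means.

    u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)), with m > 1.
    A point that coincides with one or more centroids is shared
    equally among those centroids and gets zero membership elsewhere.
    """
    d = [math.dist(point, c) for c in centroids]
    hits = [x == 0.0 for x in d]
    if any(hits):
        return [1.0 / sum(hits) if h else 0.0 for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dk) ** exponent for dk in d) for di in d]

# A point at 1 between centroids at 0 and 3 (distances 1 and 2).
u = fcm_memberships((1.0,), [(0.0,), (3.0,)])

# Hard k-means would instead assign the point entirely to the nearest centroid.
hard = min(range(2), key=lambda j: math.dist((1.0,), [(0.0,), (3.0,)][j]))
```

With these distances and m = 2, the memberships are 0.8 and 0.2, whereas the hard assignment places the point wholly in the first cluster.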

Information Sciences | 1998

Comparison of neofuzzy and rough neural networks

Pawan Lingras

Conventional neural network architectures generally lack semantics. Both rough and neofuzzy neurons introduce semantic structures in the conventional neural network models. Rough neurons make it possible to process data points with a range of values instead of a single precise value. Neofuzzy neurons make it possible to convert crisp values into fuzzy values. This paper compares rough and neofuzzy neural networks. Rough and neofuzzy neurons are demonstrated to be complementary to each other. It is shown that the introduction of rough and fuzzy semantic structures in neural networks can increase the accuracy of predictions.

ieee international conference on fuzzy systems | 2002

Rough set clustering for Web mining

Pawan Lingras

Similar to traditional data mining, three important Web mining operations include clustering, association, and sequential analysis. Typical clustering operations in Web mining involve finding natural groupings of Web resources or Web users. Researchers have pointed out some important differences between clustering in conventional applications and clustering in Web mining. For example, the clusters and associations in Web mining do not necessarily have crisp boundaries. Moreover, due to a variety of reasons inherent in Web browsing and Web logging, the likelihood of bad or incomplete data is higher. As a result, researchers have studied the possibility of using fuzzy sets in Web mining clustering applications. The paper describes how rough set theory can also be used to develop clustering schemes for Web mining. The unsupervised classification described in the paper uses properties of rough sets along with genetic algorithms to represent clusters as interval sets. The paper also describes the design of an experiment including data collection and the clustering process. The experiment is used to create interval set representations of groups of Web visitors.

European Journal of Operational Research | 2003

Genetic algorithms for rerouting shortest paths in dynamic and stochastic networks

Cedric Davies; Pawan Lingras

This paper considers the problem of finding the shortest path in a dynamic network, where the weights change as yet-to-be-known functions of time. Routing decisions are based on constantly changing predictions of the weights. The problem has some useful applications in computer and highway networks. The Genetic Algorithm (GA) based strategy presented in this paper adapts to the changing network information by rerouting during the course of its execution. The paper describes the implementation of the algorithm and results of experiments. A brief discussion on potential applications is also provided.
