Publication


Featured research published by Pradipta Maji.


Systems, Man, and Cybernetics | 2007

Rough Set Based Generalized Fuzzy C-Means Algorithm and Quantitative Indices

Pradipta Maji; Sankar K. Pal

A generalized hybrid unsupervised learning algorithm, which is termed as rough-fuzzy possibilistic C-means (RFPCM), is proposed in this paper. It comprises a judicious integration of the principles of rough and fuzzy sets. While the concept of lower and upper approximations of rough sets deals with uncertainty, vagueness, and incompleteness in class definition, the membership function of fuzzy sets enables efficient handling of overlapping partitions. It incorporates both probabilistic and possibilistic memberships simultaneously to avoid the problems of noise sensitivity of fuzzy C-means and the coincident clusters of PCM. The concept of crisp lower bound and fuzzy boundary of a class, which is introduced in the RFPCM, enables efficient selection of cluster prototypes. The algorithm is generalized in the sense that all existing variants of C-means algorithms can be derived from the proposed algorithm as a special case. Several quantitative indices are introduced based on rough sets for the evaluation of performance of the proposed C-means algorithm. The effectiveness of the algorithm, along with a comparison with other algorithms, has been demonstrated both qualitatively and quantitatively on a set of real-life data sets.
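The contrast between probabilistic and possibilistic memberships mentioned in this abstract can be illustrated with the standard fuzzy C-means and PCM update formulas. The sketch below is not the RFPCM algorithm itself; the function names, the fuzzifier `m`, and the scale parameters `etas` are illustrative assumptions.

```python
def fcm_memberships(dists, m=2.0):
    """Probabilistic (fuzzy C-means) memberships of one point.

    dists: distances from the point to each cluster prototype.
    The memberships always sum to 1 across clusters.
    """
    zero = [i for i, d in enumerate(dists) if d == 0.0]
    if zero:  # the point coincides with a prototype: full membership there
        return [1.0 / len(zero) if i in zero else 0.0 for i in range(len(dists))]
    p = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dk) ** p for dk in dists) for di in dists]


def pcm_typicalities(dists, etas, m=2.0):
    """Possibilistic (PCM) typicalities; each value depends only on its own cluster."""
    p = 1.0 / (m - 1.0)
    return [1.0 / (1.0 + (d * d / eta) ** p) for d, eta in zip(dists, etas)]


# An outlier equally far from two prototypes (hypothetical distances).
outlier = [10.0, 10.0]
u = fcm_memberships(outlier)                    # forced to share 1.0 between clusters
t = pcm_typicalities(outlier, etas=[1.0, 1.0])  # free to be small for both clusters
print(u, t)
```

Because probabilistic memberships must sum to one, the outlier receives 0.5 for each cluster and pulls both prototypes toward it. Its possibilistic typicality is close to zero for both, which is why combining the two kinds of membership can reduce noise sensitivity.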


International Journal of Approximate Reasoning | 2011

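Rough set based maximum relevance-maximum significance criterion and Gene selection from microarray data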

Pradipta Maji; Sushmita Paul

Among the large number of genes present in microarray gene expression data, only a small fraction is effective for performing a certain diagnostic test. In this regard, a new feature selection algorithm is presented based on rough set theory. It selects a set of genes from microarray data by maximizing the relevance and significance of the selected genes. A theoretical analysis is presented to justify the use of both relevance and significance criteria for selecting a reduced gene set with high predictive accuracy. The importance of rough set theory for computing both relevance and significance of the genes is also established. The performance of the proposed algorithm, along with a comparison with other related methods, is studied using the predictive accuracy of the K-nearest neighbor rule and the support vector machine on five cancer and two arthritis microarray data sets. Among the seven data sets, the proposed algorithm attains 100% predictive accuracy for three cancer and two arthritis data sets, while the two existing rough set based algorithms attain this accuracy for only one cancer data set.
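In standard rough set theory, the relevance of a set of discretized attributes to the class labels is often measured by the dependency degree (the fraction of objects in the positive region), and the significance of an attribute by the change in dependency when it is added. The sketch below assumes these textbook definitions; the paper's exact criterion may differ, and the function names and toy data are hypothetical.

```python
from collections import defaultdict


def dependency(rows, attr_idx, labels):
    """Dependency degree gamma = |POS| / |U| for the chosen attributes.

    An object lies in the positive region when every object sharing its
    attribute values (its equivalence class) has the same class label.
    """
    seen_labels = defaultdict(set)
    counts = defaultdict(int)
    for row, y in zip(rows, labels):
        key = tuple(row[i] for i in attr_idx)
        seen_labels[key].add(y)
        counts[key] += 1
    pos = sum(n for key, n in counts.items() if len(seen_labels[key]) == 1)
    return pos / len(rows)


def significance(rows, base_idx, attr, labels):
    """Gain in dependency when `attr` is added to the attributes in `base_idx`."""
    return dependency(rows, list(base_idx) + [attr], labels) - dependency(rows, base_idx, labels)


# Toy discretized expression values for two genes over four samples.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 1, 1, 1]
print(dependency(rows, [0], labels))       # relevance of gene 0 alone
print(significance(rows, [0], 1, labels))  # what gene 1 adds on top of gene 0
```

Here each gene alone resolves only half of the samples, but together they resolve all of them, so the second gene has high significance with respect to the first.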


Systems, Man, and Cybernetics | 2010

Fuzzy–Rough Sets for Information Measures and Selection of Relevant Genes From Microarray Data

Pradipta Maji; Sankar K. Pal

Several information measures such as entropy, mutual information, and f-information have been shown to be successful for selecting a set of relevant and nonredundant genes from a high-dimensional microarray data set. However, for continuous gene expression values, it is very difficult to find the true density functions and to perform the integrations required to compute different information measures. In this regard, the concept of the fuzzy equivalence partition matrix is presented to approximate the true marginal and joint distributions of continuous gene expression values. The fuzzy equivalence partition matrix is based on the theory of fuzzy-rough sets, where each row of the matrix represents a fuzzy equivalence partition that can automatically be derived from the given expression values. The performance of the proposed approach is compared with that of existing approaches using the class separability index and the predictive accuracy of the support vector machine. An important finding, however, is that the proposed approach is shown to be effective for selecting relevant and nonredundant continuous-valued genes from microarray data.


IEEE/ACM Transactions on Computational Biology and Bioinformatics | 2013

Rough-Fuzzy Clustering for Grouping Functionally Similar Genes from Microarray Data

Pradipta Maji; Sushmita Paul

Gene expression data clustering is one of the important tasks of functional genomics as it provides a powerful tool for studying functional relationships of genes in a biological process. Identifying coexpressed groups of genes represents the basic challenge in the gene clustering problem. In this regard, a gene clustering algorithm, termed robust rough-fuzzy c-means, is proposed that judiciously integrates the merits of rough sets and fuzzy sets. While the concept of lower and upper approximations of rough sets deals with uncertainty, vagueness, and incompleteness in cluster definition, the integration of probabilistic and possibilistic memberships of fuzzy sets enables efficient handling of overlapping partitions in noisy environments. The concept of possibilistic lower bound and probabilistic boundary of a cluster, introduced in robust rough-fuzzy c-means, enables efficient selection of gene clusters. An efficient method is proposed to select initial prototypes of different gene clusters, which enables the proposed c-means algorithm to converge to an optimal or near-optimal solution and helps to discover coexpressed gene clusters. The effectiveness of the algorithm, along with a comparison with other algorithms, is demonstrated both qualitatively and quantitatively on 14 yeast microarray data sets.


Fuzzy Sets and Systems | 2009

Content-based image retrieval using visually significant point features

Minakshi Banerjee; Malay K. Kundu; Pradipta Maji

This paper presents a new image retrieval scheme using visually significant point features. The clusters of points around significant curvature regions (high, medium, and weak type) are extracted using a fuzzy set theoretic approach. Some invariant color features are computed from these points to evaluate the similarity between images. A set of relevant and non-redundant features is selected using the mutual information based minimum redundancy-maximum relevance framework. The relative importance of each feature is evaluated using a fuzzy entropy based measure, which is computed from the sets of retrieved images marked relevant and irrelevant by the users. The performance of the system is evaluated using different sets of examples from a general purpose image database. The robustness of the system is also shown when the images undergo different transformations.


IEEE Transactions on Knowledge and Data Engineering | 2010

Feature Selection Using f-Information Measures in Fuzzy Approximation Spaces

Pradipta Maji; Sankar K. Pal

The selection of nonredundant and relevant features of real-valued data sets is a highly challenging problem. A novel feature selection method is presented here based on fuzzy-rough sets by maximizing the relevance and minimizing the redundancy of the selected features. By introducing the fuzzy equivalence partition matrix, a novel representation of Shannon's entropy for fuzzy approximation spaces is proposed to measure the relevance and redundancy of features suitable for real-valued data sets. The fuzzy equivalence partition matrix also offers an efficient way to calculate many more information measures, termed f-information measures. Several f-information measures are shown to be effective for selecting nonredundant and relevant features of real-valued data sets. This paper compares the performance of different f-information measures for feature selection in fuzzy approximation spaces. Some quantitative indexes are introduced based on fuzzy-rough sets for evaluating the performance of the proposed method. The effectiveness of the proposed method, along with a comparison with other methods, is demonstrated on a set of real-life data sets.
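One way to read the entropy representation described here is as Shannon's entropy computed over fuzzy relative frequencies, where each row of a partition matrix holds the memberships of all objects in one fuzzy equivalence class. The sketch below assumes that reading, and assumes each object's memberships sum to one. The matrix layout, the function name, and the example values are illustrative, not taken from the paper.

```python
import math


def fuzzy_partition_entropy(matrix):
    """Entropy (in bits) of a feature from its fuzzy partition matrix.

    matrix[i][j] is the membership of object j in fuzzy class i.
    Assumes each column sums to 1, so the row averages behave like
    probabilities (fuzzy relative frequencies).
    """
    n = len(matrix[0])
    for j in range(n):
        col = sum(row[j] for row in matrix)
        if not math.isclose(col, 1.0, abs_tol=1e-9):
            raise ValueError(f"memberships of object {j} sum to {col}, not 1")
    freqs = [sum(row) / n for row in matrix]
    return -sum(f * math.log2(f) for f in freqs if f > 0.0)


# Crisp partition: reduces to ordinary Shannon entropy of class frequencies.
crisp = [[1, 1, 1, 0],
         [0, 0, 0, 1]]
# Fuzzy partition of the same four objects (hypothetical memberships).
fuzzy = [[0.9, 0.8, 0.2, 0.1],
         [0.1, 0.2, 0.8, 0.9]]
print(fuzzy_partition_entropy(crisp), fuzzy_partition_entropy(fuzzy))
```

With crisp 0/1 memberships the computation coincides with the usual entropy of a discretized feature, which is the sense in which a fuzzy partition matrix can stand in for density estimates of continuous values.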


IEEE Transactions on Knowledge and Data Engineering | 2007

Rough-Fuzzy C-Medoids Algorithm and Selection of Bio-Basis for Amino Acid Sequence Analysis

Pradipta Maji; Sankar K. Pal

In most pattern recognition algorithms, amino acids cannot be used directly as inputs since they are nonnumerical variables. They, therefore, need encoding prior to input. In this regard, the bio-basis function maps a nonnumerical sequence space to a numerical feature space. It is designed using an amino acid mutation matrix. One of the important issues for the bio-basis function is how to select the minimum set of bio-bases with maximum information. In this paper, we describe an algorithm, termed the rough-fuzzy c-medoids (RFCMdd) algorithm, to select the most informative bio-bases. It comprises a judicious integration of the principles of rough sets, fuzzy sets, the c-medoids algorithm, and the amino acid mutation matrix. While the membership function of fuzzy sets enables efficient handling of overlapping partitions, the concept of lower and upper bounds of rough sets deals with uncertainty, vagueness, and incompleteness in class definition. The concept of crisp lower bound and fuzzy boundary of a class, introduced in RFCMdd, enables efficient selection of the minimum set of the most informative bio-bases. Some new indices are introduced for evaluating quantitatively the quality of selected bio-bases. The effectiveness of the proposed algorithm, along with a comparison with other algorithms, has been demonstrated on different types of protein data sets.
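The mapping from sequences to numbers can be sketched as follows, assuming the commonly cited form of the bio-basis function, exp(γ (h(x, b) − h(b, b)) / h(b, b)), where h sums pairwise scores from a mutation matrix. The scoring table here is a small hypothetical stand-in, not a real mutation matrix such as Dayhoff's, and the sequences are made up.

```python
import math

# Hypothetical similarity groups standing in for a real mutation matrix.
GROUPS = [{"A", "G"}, {"D", "E"}, {"K", "R"}]


def toy_score(a, b):
    """Toy pairwise score: 2 for identical residues, 1 within a group, else 0."""
    if a == b:
        return 2
    if any(a in g and b in g for g in GROUPS):
        return 1
    return 0


def similarity(x, b):
    """Sum of position-wise scores between two equal-length sequences."""
    if len(x) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(toy_score(p, q) for p, q in zip(x, b))


def bio_basis(x, b, gamma=1.0):
    """Numerical feature of sequence x with respect to the bio-basis string b."""
    h_bb = similarity(b, b)  # maximum attainable similarity to b
    return math.exp(gamma * (similarity(x, b) - h_bb) / h_bb)


basis = "AKDE"
print(bio_basis("AKDE", basis))  # identical to the basis string
print(bio_basis("GREE", basis))  # similar residues at three positions
```

Each input sequence is thus encoded as a vector of such values, one per selected bio-basis string, which is why choosing a small but informative set of bio-bases matters.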


Archive | 2012

Rough-Fuzzy Pattern Recognition: Applications in Bioinformatics and Medical Imaging

Pradipta Maji; Sankar K. Pal

Learn how to apply rough-fuzzy computing techniques to solve problems in bioinformatics and medical image processing.

Emphasizing applications in bioinformatics and medical image processing, this text offers a clear framework that enables readers to take advantage of the latest rough-fuzzy computing techniques to build working pattern recognition models. The authors explain step by step how to integrate rough sets with fuzzy sets in order to best manage the uncertainties in mining large data sets. Chapters are logically organized according to the major phases of pattern recognition systems development, making it easier to master such tasks as classification, clustering, and feature selection.

Rough-Fuzzy Pattern Recognition examines the important underlying theory as well as algorithms and applications, helping readers see the connections between theory and practice. The first chapter provides an introduction to pattern recognition and data mining, including the key challenges of working with high-dimensional, real-life data sets. Next, the authors explore such topics and issues as: soft computing in pattern recognition and data mining; a mathematical framework for generalized rough sets, incorporating the concept of fuzziness in defining the granules as well as the set; selection of non-redundant and relevant features of real-valued data sets; selection of the minimum set of basis strings with maximum information for amino acid sequence analysis; and segmentation of brain MR images for visualization of human tissues.

Numerous examples and case studies help readers better understand how pattern recognition models are developed and used in practice. This text, covering the latest findings as well as directions for future research, is recommended for both students and practitioners working in systems design, pattern recognition, image analysis, data mining, bioinformatics, soft computing, and computational intelligence.


IEICE Transactions on Information and Systems | 2005

Fuzzy Cellular Automata for Modeling Pattern Classifier

Pradipta Maji; Parimal Pal Chaudhuri

This paper investigates the application of the computational model of Cellular Automata (CA) for pattern classification of real-valued data. A special class of CA referred to as Fuzzy CA (FCA) is employed to design the pattern classifier. It is a natural extension of conventional CA, which operates on binary strings and employs Boolean logic as the next-state function of a cell. By contrast, FCA employs fuzzy logic suitable for modeling real-valued functions. A matrix algebraic formulation has been proposed for analysis and synthesis of FCA. An efficient formulation of Genetic Algorithm (GA) is reported for evolution of the desired FCA to be employed as a classifier of datasets having attributes expressed as real numbers. Extensive experimental results confirm the scalability of the proposed FCA-based classifier to handle large volumes of data irrespective of the number of classes, tuples, and attributes. Excellent classification accuracy has established the FCA-based pattern classifier as an efficient and cost-effective solution for the classification problem.


Database Systems for Advanced Applications | 2004

FMACA: A Fuzzy Cellular Automata Based Pattern Classifier

Pradipta Maji; Parimal Pal Chaudhuri

This paper presents a pattern classifier to handle real valued patterns. A special class of Fuzzy Cellular Automata (FCA), referred to as Fuzzy Multiple Attractor Cellular Automata (FMACA), is employed to design the pattern classifier. The analysis reported in this paper has established the FMACA as an efficient pattern classifier for real valued patterns. Excellent classification accuracy and low memory overhead of FMACA based pattern classifier have been demonstrated through extensive experimental results.

Collaboration


Pradipta Maji's collaborations.

Top Co-Authors

Sushmita Paul, Indian Statistical Institute

Parimal Pal Chaudhuri, Netaji Subhash Engineering College

Sankar K. Pal, Indian Statistical Institute

Chandra Das, Netaji Subhash Engineering College

Biplab Sikdar, National University of Singapore

Niloy Ganguly, Indian Institute of Engineering Science and Technology

Shaswati Roy, Indian Statistical Institute

Abhirup Banerjee, Indian Statistical Institute

Partha Garai, Indian Statistical Institute

Ankita Mandal, Indian Statistical Institute