Publication


Featured research published by Chandra Das.


IEEE Transactions on Nanobioscience | 2012

Relevant and Significant Supervised Gene Clusters for Microarray Cancer Classification

Pradipta Maji; Chandra Das

An important application of microarray data in functional genomics is to classify samples according to their gene expression profiles such as to classify cancer versus normal samples or to classify different types or subtypes of cancer. One of the major tasks with gene expression data is to find co-regulated gene groups whose collective expression is strongly associated with sample categories. In this regard, a gene clustering algorithm is proposed to group genes from microarray data. It directly incorporates the information of sample categories in the grouping process for finding groups of co-regulated genes with strong association to the sample categories, yielding a supervised gene clustering algorithm. The average expression of the genes from each cluster acts as its representative. Some significant representatives are taken to form the reduced feature set to build the classifiers for cancer classification. The mutual information is used to compute both gene-gene redundancy and gene-class relevance. The performance of the proposed method, along with a comparison with existing methods, is studied on six cancer microarray data sets using the predictive accuracy of naive Bayes classifier, K-nearest neighbor rule, and support vector machine. An important finding is that the proposed algorithm is shown to be effective for identifying biologically significant gene clusters with excellent predictive capability.
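The abstract states that mutual information measures both gene-gene redundancy and gene-class relevance. As a rough illustration only, and not the authors' implementation, the sketch below estimates mutual information between two discrete variables from co-occurrence counts. It assumes expression values have already been discretized; the function name `mutual_information` and the toy data are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Estimate I(X;Y) in bits from two equal-length sequences of discrete labels."""
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x,y) * log2(p(x,y) / (p(x) * p(y))), written in terms of counts
        mi += (c / n) * log2(c * n / (count_x[x] * count_y[y]))
    return mi


# Hypothetical discretized expression levels for two genes over four samples.
gene_a = [0, 0, 1, 1]
gene_b = [1, 0, 1, 0]
labels = ["normal", "normal", "cancer", "cancer"]

relevance_a = mutual_information(gene_a, labels)  # gene_a separates the classes
relevance_b = mutual_information(gene_b, labels)  # gene_b carries no class information
redundancy_ab = mutual_information(gene_a, gene_b)
```

The same function serves both purposes: called with a gene and the class labels it gives a relevance score, and called with two genes it gives a redundancy score.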


International Conference on Communications | 2012

A novel interpolation based missing value estimation method to predict missing values in microarray gene expression data

Shilpi Bose; Chandra Das; Sourav Dutta; Samiran Chattopadhyay

Microarray experiments can generate data sets with multiple missing expression values, normally due to various experimental problems. Unfortunately, many algorithms for gene expression analysis require a complete matrix of gene array values as input. Therefore, effective missing value estimation methods are essential to minimize the effect of incomplete data sets on analysis, and to increase the range of data sets to which these algorithms can be applied. In this regard, a new interpolation-based imputation method is proposed to predict missing values in microarray gene expression data. The proposed method selects a subset of similar genes and a subset of similar samples with respect to each missing position and then applies interpolation in a novel manner to predict that missing value. The performance of the proposed method is compared, in terms of normalized root mean square error, with existing estimation techniques including K-nearest neighbor (KNN), Sequential K-nearest neighbor (SKNN) and Iterative K-nearest neighbor (IKNN). The effectiveness of the proposed method, along with a comparison with existing methods, is demonstrated on different microarray data sets.
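The abstract does not say how the normalized root mean square error is normalized. A minimal sketch, assuming the common convention of dividing the RMSE over the artificially removed entries by the standard deviation of their true values, might look as follows. The function name `nrmse` and the sample values are hypothetical.

```python
import numpy as np


def nrmse(true_values, estimated_values):
    """RMSE between true and imputed entries, divided by the std of the true entries.

    The normalization by standard deviation is an assumption; other papers
    normalize by the mean or the range instead.
    """
    t = np.asarray(true_values, dtype=float)
    e = np.asarray(estimated_values, dtype=float)
    if t.shape != e.shape:
        raise ValueError("inputs must have the same shape")
    rmse = np.sqrt(np.mean((t - e) ** 2))
    return float(rmse / np.std(t))


perfect = nrmse([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
mean_guess = nrmse([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
```

Under this convention a perfect imputation scores 0, and always guessing the mean of the true values scores 1, so values well below 1 indicate a useful estimator.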


International Journal of Machine Learning and Cybernetics | 2015

Possibilistic biclustering algorithm for discovering value-coherent overlapping δ-biclusters

Chandra Das; Pradipta Maji

One of the important tools for analyzing gene expression data is the biclustering method. It focuses on finding a subset of genes and a subset of experimental conditions that together exhibit coherent behavior. However, most of the existing biclustering algorithms find exclusive biclusters, which is inappropriate in the context of biology. Since biological processes are not independent of each other, many genes may participate in multiple different processes. Hence, nonexclusive biclustering algorithms are required for finding overlapping biclusters. In this regard, a novel possibilistic biclustering algorithm is presented here to find highly overlapping biclusters of larger volume with mean squared residue lower than a predefined threshold. It judiciously incorporates the concept of the possibilistic clustering algorithm into the biclustering framework. The integration enables efficient selection of highly overlapping coherent biclusters with mean squared residue lower than a given threshold. The detailed formulation of the proposed possibilistic biclustering algorithm, along with a mathematical analysis of its convergence property, is presented. Some quantitative indices are introduced for evaluating the quality of generated biclusters. The effectiveness of the algorithm, along with a comparison with other algorithms, is demonstrated both qualitatively and quantitatively on a yeast gene expression data set. In general, the proposed algorithm shows excellent performance at finding patterns in gene expression data.
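The mean squared residue used as the coherence threshold above is usually computed with the Cheng and Church definition. The sketch below shows that standard score only; it does not reproduce the possibilistic algorithm from the paper, and the function name `mean_squared_residue` and the toy matrices are hypothetical.

```python
import numpy as np


def mean_squared_residue(data, rows, cols):
    """Mean squared residue of the bicluster formed by the given rows and columns."""
    sub = np.asarray(data, dtype=float)[np.ix_(rows, cols)]
    row_means = sub.mean(axis=1, keepdims=True)
    col_means = sub.mean(axis=0, keepdims=True)
    overall_mean = sub.mean()
    # Residue of each entry: a_ij - a_iJ - a_Ij + a_IJ
    residue = sub - row_means - col_means + overall_mean
    return float(np.mean(residue ** 2))


# Rows differ only by a constant shift, so the bicluster is perfectly coherent.
additive = [[1, 2, 3], [2, 3, 4], [4, 5, 6]]
coherent_score = mean_squared_residue(additive, [0, 1, 2], [0, 1, 2])

# A checkerboard pattern has no additive structure.
checker = [[1, 0], [0, 1]]
incoherent_score = mean_squared_residue(checker, [0, 1], [0, 1])
```

A bicluster would be accepted as a δ-bicluster when this score falls below the chosen threshold δ.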


ACITY (2) | 2013

Effectiveness of Different Partition Based Clustering Algorithms for Estimation of Missing Values in Microarray Gene Expression Data

Shilpi Bose; Chandra Das; Abirlal Chakraborty; Samiran Chattopadhyay

Microarray experiments normally produce data sets with multiple missing expression values, due to various experimental problems. Unfortunately, many algorithms for gene expression analysis require a complete matrix of gene expression values as input. Therefore, effective missing value estimation methods are needed to minimize the effect of incomplete data during analysis of gene expression data using these algorithms. In this paper, missing values in different microarray data sets are estimated using different partition-based clustering algorithms to emphasize the fact that clustering-based methods are also useful tools for prediction of missing values. However, clustering approaches have not yet been highlighted for predicting missing values in gene expression data. The estimation accuracy of the different clustering methods is compared with that of the widely used KNNimpute and SKNNimpute methods on various microarray data sets with different rates of missing entries. The experimental results show the effectiveness of clustering-based methods compared to other existing methods in terms of root mean square error.
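For orientation, the KNN-style baseline named above can be sketched as follows. This is a simplified illustration and not the published KNNimpute code: it assumes rows are genes, uses only complete genes as neighbors, measures distance over the observed columns, and weights neighbors by inverse distance. The function name `knn_impute` and the toy matrix are hypothetical.

```python
import numpy as np


def knn_impute(data, k=2):
    """Fill NaN entries of each row from its k nearest complete rows."""
    x = np.asarray(data, dtype=float)
    filled = x.copy()
    for i in range(x.shape[0]):
        missing = np.isnan(x[i])
        if not missing.any() or missing.all():
            continue  # nothing to fill, or no observed values to compare on
        candidates = []
        for j in range(x.shape[0]):
            if j == i or np.isnan(x[j]).any():
                continue  # only complete genes serve as neighbors
            diff = x[i, ~missing] - x[j, ~missing]
            candidates.append((float(np.sqrt(np.mean(diff ** 2))), j))
        if not candidates:
            continue  # no complete gene available; leave the gaps in place
        candidates.sort()
        nearest = candidates[:k]
        # Inverse-distance weights; the small constant avoids division by zero.
        weights = np.array([1.0 / (d + 1e-9) for d, _ in nearest])
        neighbor_rows = x[[j for _, j in nearest]]
        filled[i, missing] = weights @ neighbor_rows[:, missing] / weights.sum()
    return filled


toy = [
    [1.0, 2.0, 3.0],
    [1.0, 2.0, np.nan],
    [10.0, 20.0, 30.0],
]
imputed = knn_impute(toy, k=1)
```

With `k=1` the incomplete gene borrows the missing value from its closest complete gene, here the first row.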


International Journal of Bioinformatics Research and Applications | 2016

A novel distance-based iterative sequential KNN algorithm for estimation of missing values in microarray gene expression data

Chandra Das; Shilpi Bose; Matangini Chattopadhyay; Samiran Chattopadhyay

The presence of missing entries in DNA microarray gene expression datasets creates severe problems in downstream analysis because the analysis algorithms require complete datasets. Though several missing value prediction methods have been proposed to solve this problem, they have limitations which may affect the performance of various analysis algorithms. In this regard, a novel distance-based iterative sequential K-nearest neighbour imputation method, ISKNNimpute, has been proposed. The proposed distance is a hybridisation of modified Euclidean distance and Pearson correlation coefficient. The proposed method is a modification of KNN estimation in which the concept of reuse of estimation is considered using both iterative and sequential approaches. The performance of the proposed ISKNNimpute method is tested on various time-series and non-time-series microarray datasets and compared with several widely used existing imputation techniques. The experimental results confirm that the ISKNNimpute method consistently generates better results compared to other existing methods.


2015 International Conference on Man and Machine Interfacing (MAMI) | 2015

A novel biclustering based missing value prediction method for microarray gene expression data

Samiran Chattopadhyay; Chandra Das; Shilpi Bose

The presence of missing values in microarray gene expression data creates severe problems during downstream data analysis, as analysis algorithms require complete gene expression profiles. To eliminate these missing entries, effective missing value prediction methods are essential to generate complete data. In this regard, a new biclustering-based sequential missing value imputation method is proposed here to predict missing values in microarray gene expression data. Starting from the gene with the lowest missing rate, for each missing position, the proposed method computes a bicluster by selecting a subset of similar genes and a subset of similar samples or conditions using a novel distance measure. Then the imputation is carried out sequentially by computing the weighted average of the neighbour genes and samples. To evaluate the performance, the proposed method is rigorously tested and compared with some of the well-known existing methods. The effectiveness of the proposed method is demonstrated on different microarray data sets, including time series, non-time series, and mixed.


International Conference on Advanced Computing | 2013

A Modified Local Least Squares-Based Missing Value Estimation Method in Microarray Gene Expression Data

Shilpi Bose; Chandra Das; Tamaghna Gangopadhyay; Samiran Chattopadhyay

Microarray gene expression data often contain missing values, normally due to various experimental reasons. However, most gene expression data analysis algorithms, such as clustering, classification and network design, require a complete matrix of gene array values during analysis. It is therefore very important to accurately impute the missing values before applying the data analysis algorithms. In this paper, a modified Local Least Squares imputation-based algorithm known as NSLLSimpute is introduced, which overcomes the drawbacks of the previously developed LLSimpute and SLLSimpute algorithms. The performance of the NSLLSimpute algorithm is compared with the most commonly used imputation methods, namely K-nearest neighbor imputation (KNNimpute), Sequential K-nearest neighbor imputation (SKNNimpute), Iterative K-nearest neighbor imputation (IKNNimpute), Singular Value Decomposition (SVDimpute), Local Least Squares imputation (LLSimpute) and Sequential Local Least Squares imputation (SLLSimpute), in terms of estimation accuracy using root mean square error when applied to four publicly available microarray data sets over different rates of randomly introduced missing entries.


IEEE Transactions on Nanobioscience | 2010

Protein Functional Sites Prediction Using Modified Bio-Basis Function and Quantitative Indices

Pradipta Maji; Chandra Das

The prediction of functional sites in proteins is an important issue in protein function studies and drug design. To apply kernel-based pattern recognition algorithms such as support vector machines to protein functional site prediction, a new string kernel function, termed the modified bio-basis function, was recently proposed. The bio-basis strings for the new kernel function are selected by an efficient method that integrates the Fisher ratio and the concept of degree of resemblance. In this regard, this paper introduces some quantitative indices for evaluating the quality of selected bio-basis strings. Moreover, the effectiveness of the new string kernel function and bio-basis string selection method, along with a comparison with the existing bio-basis function and related bio-basis string selection methods, is demonstrated on different protein data sets using the proposed quantitative indices and support vector machines.


IEEE Transactions on Nanobioscience | 2010

Efficient Design of Bio-Basis Function to Predict Protein Functional Sites Using Kernel-Based Classifiers

Pradipta Maji; Chandra Das

In order to apply powerful kernel-based pattern recognition algorithms such as support vector machines to predict functional sites in proteins, amino acids need encoding prior to input. In this regard, a new string kernel function, termed the modified bio-basis function, is proposed that maps a nonnumerical sequence space to a numerical feature space. The proposed string kernel function is developed based on the conventional bio-basis function and, like the conventional kernel function, needs a bio-basis string as a support. The concept of zone of influence of a bio-basis string is introduced in the proposed kernel function to take into account the influence of each bio-basis string in nonnumerical sequence space. An efficient method is described to select a set of bio-basis strings for the proposed kernel function, integrating the Fisher ratio and a novel concept of degree of resemblance. The integration enables the method to select a reduced set of relevant and nonredundant bio-basis strings.


CSI Transactions on ICT | 2014

Random number generators: performance comparison of ELCA and MaxCA

Arnab Mitra; Anirban Kundu; Chandra Das

In this paper, we compare the performance of different cellular automata based random number generators to emphasize the quality of randomness, with a focus on cost effectiveness for the concerned fault coverage. This research includes the study of the maximum length cellular automata random number generator and the proposed equal length cellular automata random number generator. The experimental results show that the resulting sequences have significant improvement in terms of randomness quality and associated fault coverage in their generation procedures. The complexities considered here for the generation of random numbers are space complexity, time complexity, design complexity and searching complexity.
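To make the idea of a maximum length cellular automaton concrete, here is a minimal sketch of a four-cell, null-boundary hybrid CA built from rules 90 and 150. The rule vector `[150, 90, 150, 90]` is chosen for illustration because it visits all 15 nonzero states before repeating; it is not taken from the paper, and the equal length CA (ELCA) design is not reproduced here. The names `ca_step`, `cycle_length` and `RULES` are hypothetical.

```python
def ca_step(state, rules):
    """Advance a one-dimensional, null-boundary rule 90/150 hybrid CA by one step."""
    n = len(state)
    nxt = []
    for i in range(n):
        left = state[i - 1] if i > 0 else 0       # null boundary on the left
        right = state[i + 1] if i < n - 1 else 0  # null boundary on the right
        # Rule 150 includes the cell itself; rule 90 uses only the neighbors.
        centre = state[i] if rules[i] == 150 else 0
        nxt.append(left ^ centre ^ right)
    return nxt


def cycle_length(seed, rules):
    """Number of steps until the seed state recurs, or None if it never does."""
    state = ca_step(seed, rules)
    steps = 1
    limit = 2 ** len(seed)  # a recurring state must reappear within this many steps
    while state != seed:
        if steps >= limit:
            return None
        state = ca_step(state, rules)
        steps += 1
    return steps


RULES = [150, 90, 150, 90]
period = cycle_length([1, 0, 0, 0], RULES)

# Successive states can be read as 4-bit pseudo-random numbers.
state = [1, 0, 0, 0]
numbers = []
for _ in range(5):
    state = ca_step(state, RULES)
    numbers.append(int("".join(map(str, state)), 2))
```

A period of 2^n - 1 for n cells is what "maximum length" refers to: every nonzero state appears exactly once per cycle.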

Collaboration


Chandra Das's collaborators.

Top Co-Authors

Pradipta Maji
Indian Statistical Institute

Shilpi Bose
Netaji Subhash Engineering College

Anirban Kundu
Netaji Subhash Engineering College

Arnab Mitra
West Bengal University of Technology

Parimal Pal Chaudhuri
Netaji Subhash Engineering College