Shubhra Sankar Ray | Researchain

Publication

Featured researches published by Shubhra Sankar Ray.

Systems, Man, and Cybernetics | 2006

Evolutionary computation in bioinformatics: a review

Sankar K. Pal; Sanghamitra Bandyopadhyay; Shubhra Sankar Ray

This paper provides an overview of the application of evolutionary algorithms to certain bioinformatics tasks. Different tasks such as gene sequence analysis, gene mapping, deoxyribonucleic acid (DNA) fragment assembly, gene finding, microarray analysis, gene regulatory network analysis, phylogenetic trees, structure prediction and analysis of DNA, ribonucleic acid and protein, and molecular docking with ligand design are first described along with their basic features. The relevance of using evolutionary algorithms for these problems is then discussed. These are followed by different approaches, along with their merits, for addressing some of the aforesaid tasks. Finally, some limitations of the current research activity are provided. An extensive bibliography is included.

International Conference on Pattern Recognition | 2004

New operators of genetic algorithms for traveling salesman problem

Shubhra Sankar Ray; Sanghamitra Bandyopadhyay; Sankar K. Pal

This paper describes an application of a genetic algorithm to the traveling salesman problem. A new knowledge-based multiple inversion operator and a neighborhood swapping operator are proposed. Experiments on different benchmark data sets show superior results compared to some other existing methods.
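As a rough illustration of the kind of operators involved, the sketch below shows a plain segment inversion and a plain adjacent swap on a tour, together with a tour-length function. These are generic textbook operators, not the knowledge-based multiple inversion or neighborhood swapping operators proposed in the paper; the function names and the toy coordinates are assumptions made for this example.

```python
import math

def tour_length(tour, coords):
    """Total length of a closed tour over 2-D city coordinates."""
    n = len(tour)
    return sum(math.dist(coords[tour[i]], coords[tour[(i + 1) % n]]) for i in range(n))

def inversion(tour, i, j):
    """Return a copy of the tour with the segment tour[i..j] reversed."""
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]

def swap_neighbors(tour, i):
    """Return a copy with the cities at positions i and i+1 (cyclic) exchanged."""
    j = (i + 1) % len(tour)
    new = tour[:]
    new[i], new[j] = new[j], new[i]
    return new

# Hypothetical toy instance: four cities on the corners of a unit square.
coords = [(0, 0), (1, 0), (1, 1), (0, 1)]
crossed = [0, 2, 1, 3]                 # tour whose edges cross on the diagonals
fixed = inversion(crossed, 1, 2)       # [0, 1, 2, 3]
print(tour_length(crossed, coords), tour_length(fixed, coords))
```

Reversing the middle segment removes the crossing edges in this toy case, which is why inversion-style moves are a common building block for TSP heuristics.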

Applied Intelligence | 2007

Genetic operators for combinatorial optimization in TSP and microarray gene ordering

Shubhra Sankar Ray; Sanghamitra Bandyopadhyay; Sankar K. Pal

This paper deals with some new operators of genetic algorithms (GAs) and demonstrates their effectiveness on the traveling salesman problem (TSP) and microarray gene ordering. The new operators developed are a nearest fragment operator, based on the concept of the nearest neighbor heuristic, and a modified version of the order crossover operator. While these result in faster convergence of GAs in finding the optimal order of genes in microarrays and cities in TSP, the nearest fragment operator can augment the search space quickly and thus obtain much better results compared to other heuristics. The appropriate number of fragments for the nearest fragment operator and the appropriate substring length, in terms of the number of cities/genes, for the modified order crossover operator are determined systematically. The gene order provided by the proposed method is seen to be superior to that of other related methods based on GAs, neural networks and clustering, in terms of biological scores computed using categorization of the genes.
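For readers unfamiliar with order crossover, a minimal sketch of the classic (unmodified) operator follows. It is not the modified version developed in the paper; the function name, the cut points `a` and `b`, and the example parents are assumptions chosen for illustration.

```python
def order_crossover(p1, p2, a, b):
    """Classic order crossover (OX): keep p1[a:b], fill the rest in p2's cyclic order."""
    n = len(p1)
    child = [None] * n
    child[a:b] = p1[a:b]              # substring inherited unchanged from parent 1
    kept = set(p1[a:b])
    # Cities of parent 2, read cyclically from just after the kept substring,
    # skipping any city already inherited from parent 1.
    donors = [p2[(b + k) % n] for k in range(n) if p2[(b + k) % n] not in kept]
    # Open positions of the child, visited in the same cyclic order.
    slots = [(b + k) % n for k in range(n) if child[(b + k) % n] is None]
    for pos, city in zip(slots, donors):
        child[pos] = city
    return child

# Hypothetical parents over eight cities (or genes), cut points 2 and 5.
p1 = [1, 2, 3, 4, 5, 6, 7, 8]
p2 = [2, 4, 6, 8, 7, 5, 3, 1]
print(order_crossover(p1, p2, 2, 5))  # [8, 7, 3, 4, 5, 1, 2, 6]
```

The substring length `b - a` is the quantity the abstract refers to as being determined systematically; here it is simply fixed by hand.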

Neural Networks | 2013

Fuzzy rough sets, and a granular neural network for unsupervised feature selection

Avatharam Ganivada; Shubhra Sankar Ray; Sankar K. Pal

A granular neural network for identifying salient features of data, based on the concepts of fuzzy sets and a newly defined fuzzy rough set, is proposed. The formation of the network mainly involves an input vector, initial connection weights and a target value. Each feature of the data is normalized between 0 and 1 and used to develop granulation structures by a user-defined α-value. The input vector and the target value of the network are defined using granulation structures, based on the concept of fuzzy sets. The same granulation structures are also presented to a decision system. The decision system helps in extracting the domain knowledge about data in the form of dependency factors, using the notion of the new fuzzy rough set. These dependency factors are assigned as the initial connection weights of the proposed network. It is then trained by minimizing a novel feature evaluation index in an unsupervised manner. The effectiveness of the proposed network, in evaluating selected features, is demonstrated on several real-life datasets. The results of the proposed network (FRGNN) are found to be statistically more significant than those of related methods in 28 of 40 instances (70%), using the paired t-test.

IEEE/ACM Transactions on Computational Biology and Bioinformatics | 2013

RNA Secondary Structure Prediction Using Soft Computing

Shubhra Sankar Ray; Sankar K. Pal

Prediction of RNA structure is invaluable in creating new drugs and understanding genetic diseases. Several deterministic algorithms and soft computing-based techniques have been developed over more than a decade to determine the structure from a known RNA sequence. Soft computing gained importance with the need to get approximate solutions for RNA sequences by considering the issues related to kinetic effects, cotranscriptional folding, and estimation of certain energy parameters. A brief description of some of the soft computing-based techniques developed for RNA secondary structure prediction is presented along with their relevance. The basic concepts of RNA and its different structural elements, like helix, bulge, hairpin loop, internal loop, and multiloop, are described. These are followed by different methodologies employing genetic algorithms, artificial neural networks, and fuzzy logic. The role of various metaheuristics, like simulated annealing, particle swarm optimization, ant colony optimization, and tabu search, is also discussed. A relative comparison among different techniques in predicting 12 known RNA secondary structures is presented as an example. Future challenging issues are then mentioned.

Frontiers in Genetics | 2012

HD-RNAS: An Automated Hierarchical Database of RNA Structures

Shubhra Sankar Ray; Sukanya Halder; Stephanie Kaypee; Dhananjay Bhattacharyya

One of the important goals of most biological investigations is to classify and organize the experimental findings so that they are readily useful for deriving generalized rules. Although there is a huge amount of information on RNA structures in PDB, there are redundant files, ambiguous synthetic sequences, etc. Moreover, a systematic hierarchical organization, reflecting RNA classification, is missing in PDB. In this investigation, we have classified all the available RNA structures from PDB through a programmatic approach. Hence, it would now be a simple task to regularly update the classification as and when new structures are released. The classification can further determine (i) a non-redundant set of RNA structures and (ii) if available, a set of structures of identical sequence and function, which can highlight structural polymorphism, ligand-induced conformational alterations, etc. Presently, we have classified the available structures (2095 PDB entries having an RNA chain longer than nine nucleotides, solved by X-ray crystallography or NMR spectroscopy) into nine functional classes. Structures with the same function and the same source are mostly seen to be similar, with subtle differences depending on their functional complexation. The web server is available online at http://www.saha.ac.in/biop/www/HD-RNAS.html and is updated regularly.

Gene | 2014

Entropic Biological Score: a cell cycle investigation for GRNs inference

Fabrício Martins Lopes; Shubhra Sankar Ray; Ronaldo Fumio Hashimoto; Roberto M. Cesar

Inference of gene regulatory networks (GRNs) is one of the most challenging research problems of Systems Biology. In this investigation, a new GRN inference methodology, called Entropic Biological Score (EBS), is proposed. It linearly combines the mean conditional entropy (MCE) from expression levels and a Biological Score (BS) obtained by integrating different biological data sources. The EBS is validated with the cell cycle related functional annotation information available from the Munich Information Center for Protein Sequences (MIPS), and compared with some existing methods for GRN inference, such as MRNET, ARACNE, CLR and MCE. For real networks, the performance of EBS, which uses the concept of integrating different data sources, is found to be superior to the aforementioned inference methods. The best results for EBS are obtained with the weights w1=0.2 and w2=0.8 for the MCE and BS values, respectively, where approximately 40% of the inferred connections are found to be correct, significantly better than related methods. The results also indicate that the expression profile is able to recover some true connections that are not present in biological annotations, thus leading to the possibility of discovering new relations between genes.
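The linear combination described above can be written down in a few lines. The sketch below only illustrates the arithmetic of weighting two values with w1=0.2 and w2=0.8; the function name, the assumption that both inputs are already scaled to a common range, and the sample values are hypothetical, and the exact formulation used in the paper may differ.

```python
def entropic_biological_score(mce, bs, w1=0.2, w2=0.8):
    """Weighted linear combination of an MCE value and a BS value.

    Both inputs are assumed to be pre-scaled to a comparable range
    (an assumption of this sketch, not a statement from the abstract).
    """
    return w1 * mce + w2 * bs

# Hypothetical values for one candidate gene pair.
score = entropic_biological_score(mce=0.5, bs=1.0)
print(round(score, 3))  # 0.9
```

With these weights the annotation-derived BS term dominates, which matches the abstract's report that the best results gave BS the larger weight.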

Journal of Biosciences | 2007

Gene ordering in partitive clustering using microarray expressions

Shubhra Sankar Ray; Sanghamitra Bandyopadhyay; Sankar K. Pal

A central step in the analysis of gene expression data is the identification of groups of genes that exhibit similar expression patterns. Clustering and ordering genes into homogeneous groups using gene expression data has been shown to be useful in functional annotation, tissue classification, regulatory motif identification, and other applications. Although there is a rich literature on gene ordering in the hierarchical clustering framework for gene expression analysis, there is, to the best of the authors' knowledge, no work addressing and evaluating the importance of gene ordering in the partitive clustering framework. Outside the framework of hierarchical clustering, different gene ordering algorithms are applied on the whole data set, and the domain of partitive clustering is still unexplored with gene ordering approaches. A new hybrid method is proposed for ordering genes in each of the clusters obtained from a partitive clustering solution, using microarray gene expressions. Two existing algorithms for optimally ordering cities in the travelling salesman problem (TSP), namely FRAG_GALK and Concorde, are hybridized individually with the self-organizing map to show the importance of gene ordering in the partitive clustering framework. We validated our hybrid approach using yeast and fibroblast data and showed that it improves the result quality of the partitive clustering solution by identifying subclusters within big clusters, grouping functionally correlated genes within clusters, minimizing the summation of gene expression distances, and maximizing biological gene ordering using MIPS categorization. Moreover, the new hybrid approach finds a comparable or sometimes superior biological gene order in less computation time than that obtained by optimal leaf ordering in a hierarchical clustering solution.

IEEE Transactions on Biomedical Engineering | 2009

Combining Multisource Information Through Functional-Annotation-Based Weighting: Gene Function Prediction in Yeast

Shubhra Sankar Ray; Sanghamitra Bandyopadhyay; Sankar K. Pal

Motivation: One of the important goals of biological investigation is to predict the function of unclassified genes. Although there is a rich literature on multi data source integration for gene function prediction, there is hardly any similar work in the framework of data source weighting using functional annotations of classified genes. In this investigation, we propose a new scoring framework, called biological score (BS) and incorporating data source weighting, for predicting the function of some of the unclassified yeast genes. Methods: The BS is computed by first evaluating the similarities between genes, arising from different data sources, in a common framework, and then integrating them as a weighted linear combination. The relative weight of each data source is determined adaptively by utilizing the information on yeast gene ontology (GO)-slim process annotations of classified genes, available from the Saccharomyces Genome Database (SGD). Genes are clustered by a method called K-BS, where, for each gene, a cluster comprising that gene and its K nearest neighbors is computed using the proposed score (BS). The performances of BS and K-BS are evaluated with gene annotations available from the Munich Information Center for Protein Sequences (MIPS). Results: We predict the functional categories of 417 classified genes from 417 clusters with 0.98 positive predictive value using K-BS. The functional categories of 12 unclassified yeast genes are also predicted. Conclusion: Our experimental results indicate that considering multiple data sources and estimating their weights with annotations of classified genes can considerably enhance the performance of BS. It has been found that even a small proportion of annotated genes can provide improvements in finding true positive gene pairs using BS.
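The two steps named in the Methods part, a weighted linear combination of per-source similarities and a per-gene cluster of K nearest neighbors, can be sketched as follows. This is a simplified reading of the abstract, not the authors' implementation: the function names, the dict-of-dicts score layout, the treatment of "nearest" as "highest score", and the gene labels are all assumptions, and the adaptive estimation of the weights from GO-slim annotations is not shown.

```python
def biological_score(similarities, weights):
    """Weighted linear combination of per-source similarity values for one gene pair."""
    return sum(w * s for w, s in zip(weights, similarities))

def k_bs_clusters(score, k):
    """For each gene, build a cluster of that gene plus its k highest-scoring neighbours.

    `score[g][h]` is the combined score between genes g and h (hypothetical layout).
    """
    clusters = {}
    for g, row in score.items():
        neighbours = sorted((h for h in row if h != g), key=lambda h: row[h], reverse=True)
        clusters[g] = [g] + neighbours[:k]
    return clusters

# Hypothetical example: two data sources, fixed weights, four placeholder genes.
weights = [0.25, 0.75]
pair_sims = {("g1", "g2"): [1.0, 0.8], ("g1", "g3"): [0.2, 0.0],
             ("g1", "g4"): [0.4, 0.6], ("g2", "g3"): [0.0, 0.2],
             ("g2", "g4"): [0.6, 0.2], ("g3", "g4"): [0.8, 0.8]}
score = {}
for (g, h), sims in pair_sims.items():
    bs = biological_score(sims, weights)
    score.setdefault(g, {})[h] = bs
    score.setdefault(h, {})[g] = bs

print(k_bs_clusters(score, k=2)["g1"])  # ['g1', 'g2', 'g4']
```

Because every gene seeds its own cluster, the number of clusters equals the number of genes, which is consistent with the abstract's "417 classified genes from 417 clusters".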

IEEE Transactions on Biomedical Engineering | 2012

A Weighted Power Framework for Integrating Multisource Information: Gene Function Prediction in Yeast

Shubhra Sankar Ray; Sanghamitra Bandyopadhyay; Sankar K. Pal

Predicting the functions of unannotated genes is one of the major challenges of biological investigation. In this study, we propose a weighted power scoring framework, called weighted power biological score (WPBS), for combining different biological data sources and predicting the function of some of the unclassified yeast Saccharomyces cerevisiae genes. The relative power and weight coefficients of different data sources, in the proposed score, are estimated systematically by utilizing functional annotations [yeast Gene Ontology (GO)-Slim: Process] of classified genes, available from Saccharomyces Genome Database. Genes are then clustered by applying k-medoids algorithm on WPBS, and functional categories of 334 unclassified genes are predicted using a P-value cutoff 1 × 10⁻⁵. The WPBS is available online at http://www.isical.ac.in/~shubhra/WPBS/WPBS.html, where one can download WPBS, related files, and a MATLAB code to predict functions of unclassified genes.
