Malay Dutta
Tezpur University
Publication
Featured research published by Malay Dutta.
Pattern Recognition Letters | 2005
Malay Dutta; A. Kakoti Mahanta; Arun K. Pujari
The ROCK algorithm is an agglomerative hierarchical clustering algorithm for clustering categorical data [Guha S., Rastogi, R., Shim, K., 1999. ROCK: A robust clustering algorithm for categorical attributes. In: Proc. IEEE Internat. Conf. Data Engineering, Sydney, March 1999]. In this paper we prove that, under certain conditions, the final clusters obtained by the algorithm are nothing but the connected components of a certain graph with the input data-points as vertices. We propose a new algorithm, QROCK, which computes the clusters by determining the connected components of the graph. This leads to a very efficient method of obtaining the clusters, drastically reducing the computing time relative to the ROCK algorithm. We also argue that it is more practical to specify the similarity threshold than to specify the desired number of clusters a priori. The QROCK algorithm also detects the outliers in this process. We also discuss a new similarity measure for categorical attributes.
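The connected-components idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' QROCK implementation: the use of Jaccard similarity, the function names `jaccard` and `threshold_components`, the sample records, and the choice to report singleton components as outliers are all assumptions made for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two sets of attribute values."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def threshold_components(points, theta):
    """Group points into connected components of the neighbour graph.

    Two points are treated as neighbours when their similarity is at
    least theta. Components are found with a union-find structure.
    """
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if jaccard(points[i], points[j]) >= theta:
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[rj] = ri

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    components = sorted(groups.values())
    # Assumption: a component with a single point is reported as an outlier.
    clusters = [c for c in components if len(c) > 1]
    outliers = [c[0] for c in components if len(c) == 1]
    return clusters, outliers


# Hypothetical categorical records, each represented as a set of values.
records = [
    {"red", "round", "small"},
    {"red", "round", "large"},
    {"red", "square", "large"},
    {"blue", "flat", "tiny"},
]
clusters, outliers = threshold_components(records, theta=0.5)
```

In this toy data the first three records are chained together through pairwise similarities of 0.5, while the last record shares nothing with the others. The only parameter the caller supplies is the threshold `theta`, which mirrors the abstract's point about specifying a similarity threshold instead of a cluster count.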
Proceedings of the American Mathematical Society | 1976
Malay Dutta; U. B. Tewari
Let T be a multiplier of a Segal algebra S on a locally compact abelian group G. We prove that T(S) is closed if and only if T is a product of an idempotent and an invertible multiplier. We also show that the techniques developed in the proof of this theorem can be used to obtain some other known results.
Journal of Molecular Evolution | 2012
Siddhartha Sankar Satapathy; Malay Dutta; Alak Kumar Buragohain; Suvendra Kumar Ray
It is generally believed that the effect of translational selection on codon usage bias is related to the number of transfer RNA genes in bacteria, which is more with respect to the high expression genes than the whole genome. Keeping this in the background, we analyzed codon usage bias with respect to asparagine, isoleucine, phenylalanine, and tyrosine amino acids. Analysis was done in seventeen bacteria with the available gene expression data and information about the tRNA gene number. In most of the bacteria, it was observed that codon usage bias and tRNA gene number were not in agreement, which was unexpected. We extended the study further to 199 bacteria, limiting to the codon usage bias in the two highly expressed genes rpoB and rpoC which encode the RNA polymerase subunits β and β′, respectively. In concordance with the result in the high expression genes, codon usage bias in rpoB and rpoC genes was also found to not be in agreement with tRNA gene number in many of these bacteria. Our study indicates that tRNA gene numbers may not be the sole determining factor for translational selection of codon usage bias in bacterial genomes.
Journal of Molecular Evolution | 2014
Siddhartha Sankar Satapathy; Bhesh Raj Powdel; Malay Dutta; Alak Kumar Buragohain; Suvendra Kumar Ray
The fourfold degenerate site (FDS) in coding sequences is important for studying the effect of any selection pressure on codon usage bias (CUB) because nucleotide substitution per se is not under any such pressure at the site due to the unaltered amino acid sequence in a protein. We estimated the frequency variation of nucleotides at the FDS across the eight family boxes (FBs) defined as Um(g), the unevenness measure of a gene g. The study was made in 545 species of bacteria. In many bacteria, the Um(g) correlated strongly with Nc′—a measure of the CUB. Analysis of the strongly correlated bacteria revealed that the U-ending codons (GGU, CGU) were preferred to the G-ending codons (GGG, CGG) in Gly and Arg FBs even in the genomes with G+C % higher than 65.0. Further evidence suggested that these codons can be used as a good indicator of selection pressure on CUB in genomes with higher G+C %.
International Conference on Advanced Computing | 2008
M Kalita; Dhruba K. Bhattacharyya; Malay Dutta
This paper presents a privacy-preserving clustering technique using a hybrid approach. The technique mainly exploits a combination of isometric transformations, i.e. translation, rotation and reflection, along with a secure random function in order to provide secrecy of user-specified attributes without losing accuracy in the results. The proposed method was tested and evaluated on several synthetic as well as real-life datasets, and its performance has been found satisfactory in comparison to its counterparts.
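Why isometric transformations leave clustering results intact can be sketched in a few lines of Python: pairwise distances are unchanged, so any distance-based clustering sees the same geometry. This is an illustration only, not the paper's method. The function names, the 2-D sample points, and the seeded `random.Random` generator (a stand-in for the secure random function mentioned in the abstract) are assumptions.

```python
import math
import random


def isometric_transform(points, angle, shift, reflect):
    """Apply an optional reflection about the x-axis, then a rotation, then a translation."""
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    out = []
    for x, y in points:
        if reflect:
            y = -y
        xr = cos_a * x - sin_a * y
        yr = sin_a * x + cos_a * y
        out.append((xr + shift[0], yr + shift[1]))
    return out


def pairwise_distances(points):
    """Euclidean distances between all unordered pairs of points."""
    return [math.dist(p, q) for i, p in enumerate(points) for q in points[i + 1:]]


# Assumption: a seeded generator stands in for a secure random source.
rng = random.Random(0)
angle = rng.uniform(0, 2 * math.pi)
shift = (rng.uniform(-10, 10), rng.uniform(-10, 10))

original = [(1.0, 2.0), (4.0, 6.0), (-3.0, 0.5)]  # hypothetical attribute pairs
masked = isometric_transform(original, angle, shift, reflect=True)

distances_preserved = all(
    math.isclose(a, b, abs_tol=1e-9)
    for a, b in zip(pairwise_distances(original), pairwise_distances(masked))
)
```

The masked coordinates differ from the originals, yet every pairwise distance agrees up to floating-point rounding, which is the property that lets the transformed data be clustered without loss of accuracy.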
Microbiological Research | 2010
Siddhartha Sankar Satapathy; Malay Dutta; Suvendra Kumar Ray
We have done a comparative study of tRNA diversity and total tRNA genes among different strains of bacteria with respect to the optimum growth temperature of the cells. Our observation suggests that higher tRNA diversity usually occurs in thermophiles in comparison to non-thermophiles. Among psychrophiles, total tRNA was observed to be more than two-fold higher than in the non-psychrophiles. Though tRNA diversity and total tRNA have recently been shown to be affected by an organism's genomic GC% and growth rate, this work is the first report on growth temperature affecting these features in bacteria. This work extends the list of molecular features undergoing adaptation due to growth temperature and supports the view that growth temperature acts as a selecting factor at the molecular level during evolution.
Gene | 2014
Siddhartha Sankar Satapathy; Bhesh Raj Powdel; Malay Dutta; Alak Kumar Buragohain; Suvendra Kumar Ray
It has been reported earlier that the relative di-nucleotide frequency (RDF) in different parts of a genome is similar while the frequency is variable among different genomes. So RDF is termed as genome signature in bacteria. It is not known if the constancy in RDF is governed by genome wide mutational bias or by selection. Here we did comparative analysis of RDF between the inter-genic and the coding sequences in seventeen bacterial genomes, whose gene expression data was available. The constraint on di-nucleotides was found to be higher in the coding sequences than that in the inter-genic regions and the constraint at the 2nd codon position was more than that in the 3rd position within a genome. Further analysis revealed that the constraint on di-nucleotides at the 2nd codon position is greater in the high expression genes (HEG) than that in the whole genomes as well as in the low expression genes (LEG). We analyzed RDF at the 2nd and the 3rd codon positions in simulated coding sequences that were computationally generated by keeping the codon usage bias (CUB) according to genome G+C composition and the sequence of amino acids unaltered. In the simulated coding sequences, the constraint observed was significantly low and no significant difference was observed between the HEG and the LEG in terms of di-nucleotide constraint. This indicated that the greater constraint on di-nucleotides in the HEG was due to the stronger selection on CUB in these genes in comparison to the LEG within a genome. Further, we did comparative analyses of the RDF in the HEG rpoB and rpoC of 199 bacteria, which revealed a common pattern of constraints on di-nucleotides at the 2nd codon position across these bacteria. To validate the role of CUB on di-nucleotide constraint, we analyzed RDF at the 2nd and the 3rd codon positions in simulated rpoB/rpoC sequences. The analysis revealed that selection on CUB is an important attribute for the constraint on di-nucleotides at these positions in bacterial genomes. We believe that this study has come with major findings of the role of CUB on di-nucleotide constraint in bacterial genomes.
International Journal of Bioinformatics Research and Applications | 2018
Sarmistha Deb; Priyakshi Mahanta; Dhruba K. Bhattacharyya; Malay Dutta
Most of the existing methods in the literature have used proximity measures in the construction of co-expression networks (CEN) consisting of functional gene modules. This work describes the construction of a co-expression network using mutual information (MI) as a proximity measure that captures non-linear correlation. Network modules defined over a subset of samples are then extracted. The method has been tested on several publicly available datasets, and the subspace network modules obtained have been validated in terms of both internal and external measures.
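As a rough illustration of mutual information as a proximity measure between two expression profiles, the following Python sketch discretizes each profile and computes MI from the joint label counts. The abstract does not describe the estimator used in the paper, so the equal-width binning, the bin count of 3, the function names, and the sample profiles are all assumptions.

```python
import math
from collections import Counter


def discretize(values, bins=3):
    """Equal-width binning of a numeric profile into integer labels."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant profile
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y):
    """Mutual information (in bits) between two discrete label sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    # p(a,b) * log2(p(a,b) / (p(a) * p(b))), written in terms of raw counts.
    return sum(
        (c / n) * math.log2(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


# Hypothetical profiles: gene_b depends on gene_a non-linearly (a parabola),
# so their linear correlation is zero even though they are clearly related.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
gene_b = [(v - 3.5) ** 2 for v in gene_a]

labels_a = discretize(gene_a)
labels_b = discretize(gene_b)
mi_ab = mutual_information(labels_a, labels_b)
mi_aa = mutual_information(labels_a, labels_a)
```

In a network construction along these lines, an edge between two genes would be kept when their MI exceeds some cut-off; how that cut-off is chosen is outside what the abstract states.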
IEEE International Conference on Electrical, Computer and Communication Technologies | 2015
Mala Dutta; Malay Dutta; Anjana Kakoti Mahanta
In this paper, an incremental method for mining the set of closed intervals in an interval database is presented. In fast-growing data, new intervals are added to an interval database over time. Some earlier methods for mining closed intervals in an interval database assumed the database to be static and hence such methods are not effective for databases whose sizes are incremented over time. Though an incremental method for mining closed intervals has been proposed earlier, the incremental method presented in this paper for the same problem is more time-efficient than the previous method. The method proposed in this paper takes only O(n) time to update the set of closed intervals in an interval database containing n intervals after a new interval is added to it, as compared to O(n^2) time taken by the earlier incremental method. The method proposed in this paper has been tested on real-life and synthetic data and the results are reported.
National Conference Computational Intelligence | 2012
Malay Dutta
Summary form only given. An overview of various kinds of optimization problems will be given, with examples from real-world applications. Some classical methods of solving such problems, for example greedy algorithms, dynamic programming, and the method of steepest descent, will be mentioned. The notion of polynomial-time algorithms and their importance will be explained. Some examples of optimization problems for which no polynomial-time algorithms are expected to exist, and which are hence considered intractable, will be given. Various techniques for handling such problems to obtain rough, workable solutions will be discussed. Finally, progress in resolving the million-dollar question of whether these intractable problems are really intractable (in brief, the P vs NP question) will be mentioned.
Collaboration
Dive into Malay Dutta's collaboration.
North Eastern Regional Institute of Science and Technology