Publication


Featured research published by Tanmaya Kumar Sahu.


Scientific Reports | 2017

Predicting antimicrobial peptides with improved accuracy by incorporating the compositional, physico-chemical and structural features into Chou’s general PseAAC

Prabina Kumar Meher; Tanmaya Kumar Sahu; Varsha Saini; A. R. Rao

Antimicrobial peptides (AMPs) are important components of the innate immune system that have been found to be effective against disease-causing pathogens. Identifying AMPs through wet-lab experiments is expensive. Therefore, an efficient computational tool is essential for identifying the best candidate AMPs prior to in vitro experimentation. In this study, we attempted to develop a support vector machine (SVM) based computational approach for predicting AMPs with improved accuracy. Initially, compositional, physico-chemical and structural features of the peptides were generated, which were subsequently used as input to the SVM for prediction of AMPs. The proposed approach achieved higher accuracy than several existing approaches when compared on a benchmark dataset. Based on the proposed approach, an online prediction server, iAMPpred, has also been developed to help the scientific community in predicting AMPs; it is freely accessible at http://cabgrid.res.in:8080/amppred/. The proposed approach is believed to supplement the tools and techniques that have been developed in the past for prediction of AMPs.


BMC Bioinformatics | 2014

A statistical approach for 5′ splice site prediction using short sequence motifs and without encoding sequence data

Prabina Kumar Meher; Tanmaya Kumar Sahu; A. R. Rao; S. D. Wahi

Background: Most approaches for splice site prediction are based on machine learning techniques. Although these approaches provide high prediction accuracy, they use long window lengths. Hence, they may not be suitable for predicting novel splice variants from the short sequence reads generated by next-generation sequencing technologies. Further, machine learning techniques require numerically encoded data and produce different accuracies with different encoding procedures. Therefore, splice site prediction with short sequence motifs and without encoding sequence data became the motivation for the present study.

Results: An approach for finding associations among nucleotide bases in the splice site motifs was developed and then used to determine the appropriate window size. In addition, an approach for predicting donor splice sites using a sum of absolute error criterion is proposed. The proposed approach was compared with commonly used approaches, i.e., Maximum Entropy Modeling (MEM), Maximal Dependency Decomposition (MDD), Weighted Matrix Method (WMM) and Markov Model of first order (MM1), and was found to perform on par with MEM and MDD and better than WMM and MM1 in terms of prediction accuracy.

Conclusions: The proposed approach can predict donor splice sites with higher accuracy using short sequence motifs and hence can be used as a complementary method to the existing approaches. Based on the proposed methodology, a web server was also developed for easy prediction of donor splice sites by users and is available at http://cabgrid.res.in:8080/sspred.
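To make the comparison concrete, here is a minimal sketch of a weight-matrix scorer in the spirit of the WMM baseline named above. It is not the sum of absolute error method proposed in the paper, whose details the abstract does not give. The function names, the pseudocount, the uniform background of 0.25 and the toy 8-base motifs are all assumptions made for illustration.

```python
import math

def build_weight_matrix(sites, alphabet="ACGT", pseudocount=1.0):
    """Estimate per-position base probabilities from aligned, equal-length motifs."""
    length = len(sites[0])
    matrix = []
    for pos in range(length):
        column = [site[pos] for site in sites]
        total = len(column) + pseudocount * len(alphabet)
        # The pseudocount keeps unseen bases from getting probability zero.
        matrix.append({b: (column.count(b) + pseudocount) / total for b in alphabet})
    return matrix

def log_odds_score(motif, matrix, background=0.25):
    """Sum of log(p_position(base) / background); higher means more site-like."""
    return sum(math.log(matrix[i][base] / background) for i, base in enumerate(motif))

# Toy aligned donor-like motifs (made up for illustration, not real data).
true_sites = ["AGGTAAGT", "AGGTGAGT", "CGGTAAGT"]
wmm = build_weight_matrix(true_sites)
print(log_odds_score("AGGTAAGT", wmm) > log_odds_score("CCCCCCCC", wmm))
```

A candidate motif would be called a donor site when its score exceeds a threshold chosen on held-out data; the threshold itself is not specified here.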


Biodata Mining | 2016

Prediction of donor splice sites using random forest with a new sequence encoding approach

Prabina Kumar Meher; Tanmaya Kumar Sahu; A. R. Rao

Background: Detection of splice sites plays a key role in predicting gene structure, and thus the development of efficient analytical methods for splice site prediction is vital. This paper presents a novel sequence encoding approach based on adjacent di-nucleotide dependencies, in which the donor splice site motifs are encoded into numeric vectors. The encoded vectors are then used as input to Random Forest (RF), Support Vector Machine (SVM), Artificial Neural Network (ANN), Bagging, Boosting, Logistic regression, kNN and Naïve Bayes classifiers for prediction of donor splice sites.

Results: The performance of the proposed approach was evaluated on the donor splice site sequence data of Homo sapiens, collected from the Homo Sapiens Splice Sites Dataset (HS3D). The results showed that RF outperformed all the other classifiers considered. In addition, RF achieved higher prediction accuracy than the existing methods, viz., MEM, MDD, WMM, MM1, NNSplice and SpliceView, when compared using an independent test dataset.

Conclusion: Based on the proposed approach, we have developed an online prediction server (MaLDoSS) to help the biological community in predicting donor splice sites. The server is freely available at http://cabgrid.res.in:8080/maldoss. Owing to its computational feasibility and high prediction accuracy, the proposed approach is believed to help in predicting eukaryotic gene structure.


Gene | 2016

Identification of species based on DNA barcode using k-mer feature vector and Random forest classifier

Prabina Kumar Meher; Tanmaya Kumar Sahu; A. R. Rao

DNA barcoding is a molecular diagnostic method that allows automated and accurate identification of species based on a short, standardized fragment of DNA. To this end, this study attempts to develop a computational approach for identifying a species by comparing its barcode with the barcode sequences of known species present in the reference library. Each barcode sequence was first mapped onto a numeric feature vector based on k-mer frequencies, and the Random forest methodology was then applied to the transformed dataset for species identification. The proposed approach outperformed similarity-based, tree-based and diagnostic-based approaches and was found to be comparable with existing supervised learning based approaches in terms of species identification success rate when compared using real and simulated datasets. Based on the proposed approach, an online web interface, SPIDBAR, has also been developed and made freely available at http://cabgrid.res.in:8080/spidbar/ for species identification by taxonomists.
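The k-mer mapping described in this abstract can be sketched as follows. This is a minimal illustration, not the SPIDBAR implementation: the function name, the choice of k = 2, the normalisation by the number of valid windows and the handling of ambiguous bases are assumptions.

```python
from itertools import product

def kmer_frequencies(sequence, k=2, alphabet="ACGT"):
    """Map a DNA sequence to a vector of normalised k-mer frequencies."""
    sequence = sequence.upper()
    # Fixed, lexicographic k-mer order so every sequence gets the same layout.
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(sequence) - k + 1):
        window = sequence[i:i + k]
        if window in counts:  # skip windows with ambiguous bases such as N
            counts[window] += 1
            total += 1
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[kmer] / total for kmer in kmers]

vector = kmer_frequencies("ACGTACGTNA", k=2)
print(len(vector))  # one entry per possible 2-mer
```

Vectors built this way for a reference library, together with species labels, could then be passed to any random forest implementation (for example scikit-learn's `RandomForestClassifier`); which implementation and settings the authors used is not stated in the abstract.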


Bioinformation | 2012

shRNAPred (version 1.0): An open source and standalone software for short hairpin RNA (shRNA) prediction

Nishtha Singh; Tanmaya Kumar Sahu; A. R. Rao; T. Mohapatra

Small hairpin RNAs (shRNAs) are useful in many ways, such as identification of trait-specific molecular markers, gene silencing and characterization of a species. Hardly any standalone software for shRNA prediction exists in the public domain. Hence, the software shRNAPred (1.0) is proposed here to offer a user-friendly Command-line User Interface (CUI) for predicting ‘shRNA-like’ regions from a large set of nucleotide sequences. The software is developed using PERL version 5.12.5, taking into account parameters such as stem and loop length combinations, specific loop sequence, GC content, melting temperature, position-specific nucleotides, low complexity filter, etc. Each parameter is assigned a specific score, based on which the software ranks the predicted shRNAs. The high-scoring shRNAs obtained from the software are reported as potential shRNAs and provided to the user in the form of a text file. The proposed software also allows the user to customize certain parameters while predicting specific shRNAs of interest. shRNAPred (1.0) is open access software available to academic users. It can be downloaded freely along with the user manual, an example dataset and output for easy understanding and implementation.

Availability: The software is available for free at http://bioinformatics.iasri.res.in/EDA/downloads/shRNAPred_v1.0.exe
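Two of the parameters listed above, GC content and melting temperature, are simple to compute. The sketch below is written in Python rather than the tool's Perl and does not reproduce shRNAPred's scoring. It uses the Wallace rule (2 °C per A/T, 4 °C per G/C), a rough estimate intended for short oligonucleotides. The function names, the 30–60% GC window and the 0/1 score are hypothetical.

```python
def gc_content(sequence):
    """Return the fraction of G and C bases in a nucleotide sequence."""
    sequence = sequence.upper()
    if not sequence:
        return 0.0
    return (sequence.count("G") + sequence.count("C")) / len(sequence)

def wallace_tm(sequence):
    """Estimate melting temperature in deg C with the Wallace rule: 2*(A+T) + 4*(G+C)."""
    sequence = sequence.upper().replace("U", "T")  # treat RNA input like DNA
    at = sequence.count("A") + sequence.count("T")
    gc = sequence.count("G") + sequence.count("C")
    return 2 * at + 4 * gc

def gc_score(sequence, gc_range=(0.30, 0.60)):
    """Hypothetical score: 1 if GC content lies inside the window, else 0."""
    low, high = gc_range
    return 1 if low <= gc_content(sequence) <= high else 0

candidate = "GCATGCATGCATGCATGCA"  # made-up 19-nt stem sequence
print(round(gc_content(candidate), 3), wallace_tm(candidate), gc_score(candidate))
```

A ranking in the style the abstract describes would sum such per-parameter scores over all criteria; the actual weights are not given in the source.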


BMC Bioinformatics | 2017

DIRProt: a computational approach for discriminating insecticide resistant proteins from non-resistant proteins

Prabina Kumar Meher; Tanmaya Kumar Sahu; Anjali Banchariya; A. R. Rao

Background: Insecticide resistance is a major challenge for insect pest control programs in the fields of crop protection, human and animal health, etc. Resistance to different insecticides is conferred by the proteins encoded by certain classes of insect genes. To date, no computational tool is available to distinguish insecticide resistant proteins from non-resistant proteins. Thus, development of such a tool will be helpful in predicting insecticide resistant proteins, which can be targeted for developing appropriate insecticides.

Results: Five different feature sets, viz., amino acid composition (AAC), di-peptide composition (DPC), pseudo amino acid composition (PAAC), composition-transition-distribution (CTD) and auto-correlation function (ACF), were used to map the protein sequences into numeric feature vectors. The encoded numeric vectors were then used as input to a support vector machine (SVM) for classification of insecticide resistant and non-resistant proteins. Higher accuracies were obtained with the RBF kernel than with other kernels. Further, accuracies were observed to be higher for the DPC feature set than for the others. The proposed approach achieved an overall accuracy of >90% in discriminating resistant from non-resistant proteins. Further, the two classes of resistant proteins, i.e., detoxification-based and target-based, were discriminated from non-resistant proteins with >95% accuracy. In addition, >95% accuracy was also observed for discrimination of proteins involved in detoxification-based and target-based resistance mechanisms. The proposed approach not only outperformed the Blastp, PSI-Blast and Delta-Blast algorithms, but also achieved >92% accuracy when assessed using an independent dataset of 75 insecticide resistant proteins.

Conclusions: This paper presents the first computational approach for discriminating insecticide resistant proteins from non-resistant proteins. Based on the proposed approach, an online prediction server, DIRProt, has also been developed for computational prediction of insecticide resistant proteins, which is accessible at http://cabgrid.res.in:8080/dirprot/. The proposed approach is believed to supplement the efforts needed to develop dynamic insecticides in the wet lab by targeting the insecticide resistant proteins.
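Two of the feature sets named in this abstract, AAC and DPC, can be sketched in a few lines. This is an illustrative version under assumptions, not the DIRProt code: the function names, the alphabetical residue order and the decision to ignore non-standard residues are choices made here.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def amino_acid_composition(protein):
    """AAC: fraction of each standard residue, giving a 20-dimensional vector."""
    residues = [r for r in protein.upper() if r in AMINO_ACIDS]
    n = len(residues)
    return [residues.count(a) / n if n else 0.0 for a in AMINO_ACIDS]

def dipeptide_composition(protein):
    """DPC: fraction of each adjacent residue pair, giving a 400-dimensional vector."""
    protein = protein.upper()
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    total = 0
    for i in range(len(protein) - 1):
        pair = protein[i:i + 2]
        if pair in counts:  # skip pairs containing non-standard symbols
            counts[pair] += 1
            total += 1
    return [counts[p] / total if total else 0.0 for p in pairs]

toy_protein = "MKTAYIAKQR"  # made-up sequence for illustration
print(len(amino_acid_composition(toy_protein)), len(dipeptide_composition(toy_protein)))
```

Either vector can serve as the numeric input to a classifier such as an SVM with an RBF kernel, which is the combination the abstract reports as performing best.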


Journal of Theoretical Biology | 2016

A computational approach for prediction of donor splice sites with improved accuracy

Prabina Kumar Meher; Tanmaya Kumar Sahu; A. R. Rao; S. D. Wahi

Identification of splice sites is important due to their key role in predicting the exon-intron structure of protein coding genes. Though several approaches have been developed for the prediction of splice sites, further improvement in the prediction accuracy will help predict gene structure more accurately. This paper presents a computational approach for prediction of donor splice sites with higher accuracy. In this approach, true and false splice sites were first encoded into numeric vectors and then used as input to an artificial neural network (ANN), a support vector machine (SVM) and a random forest (RF) for prediction. ANN and SVM were found to perform equally well and better than RF when tested on the HS3D and NN269 datasets. Further, the performance of ANN, SVM and RF was analyzed using an independent test set of 50 genes, and the prediction accuracy of ANN was found to be higher than that of SVM and RF. All the predictors achieved higher accuracy than existing methods such as NNsplice, MEM, MDD, WMM, MM1, FSPLICE, GeneID and ASSP on the independent test set. We have also developed an online prediction server (PreDOSS), available at http://cabgrid.res.in:8080/predoss, for prediction of donor splice sites using the proposed approach.


Journal of Biomolecular Structure & Dynamics | 2018

In silico site-directed mutagenesis of neutralizing mAb 4C4 and analysis of its interaction with G-H loop of VP1 to explore its therapeutic applications against FMD

Tanmaya Kumar Sahu; Dibyabhaba Pradhan; A. R. Rao; Lingaraj Jena

Investigating the behaviour of bio-molecules through computational mutagenesis is gaining interest as a way to facilitate the development of new therapeutic solutions for infectious diseases. The antigenically variant genotypes of foot and mouth disease virus (FMDV) and their subsequent infections are challenging to tackle with traditional vaccination. In such a scenario, neutralizing antibodies might provide an alternative solution for managing FMDV infection. Thus, we have analysed the interaction of the mAb 4C4 with a synthetic G-H loop of FMDV-VP1 through in silico mutagenesis and molecular modelling. Initially, a set of 25,434 mutants was designed, and the mutants having better energetic stability than 4C4 were clustered based on sequence identity. The best mutant representing each cluster was selected and evaluated for its binding affinity with the antigen in terms of docking scores, interaction energy and binding energy. Six mutants showed better binding affinities towards the antigen than 4C4. Further, the interaction of these mutants with the natural G-H loop that is bound to mAb SD6 was also evaluated. One 4C4 variant having mutations at the positions 2034(N→L), 2096(N→C), 2098(D→Y), 2532(T→K) and 2599(A→G) showed better binding affinities towards both the synthetic and natural G-H loops than 4C4 and SD6, respectively. A 50 ns molecular dynamics simulation was conducted for the mutant and wild-type antibody structures, which supported the pre-simulation results. Therefore, these mutations in mAb 4C4 are believed to provide a better antibody-based therapeutic option for FMD.

Communicated by Ramaswamy H. Sarma


Frontiers in Microbiology | 2018

nifPred: Proteome-Wide Identification and Categorization of Nitrogen-Fixation Proteins of Diaztrophs Based on Composition-Transition-Distribution Features Using Support Vector Machine

Prabina Kumar Meher; Tanmaya Kumar Sahu; Jyotilipsa Mohanty; Shachi Gahoi; Supriya Purru; Monendra Grover; A. R. Rao

As inorganic nitrogen compounds are essential for the basic building blocks of life (e.g., nucleotides and amino acids), the role of biological nitrogen-fixation (BNF) is indispensable. All nitrogen-fixing microbes rely on the same nitrogenase enzyme for nitrogen reduction, which is in fact an enzyme complex that consists of as many as 20 genes. However, the occurrence of six genes, viz., nifB, nifD, nifE, nifH, nifK, and nifN, has been proposed to be essential for a functional nitrogenase enzyme. Therefore, identification of these genes is important for understanding the mechanism of BNF as well as for exploring the possibilities of improving BNF from an agricultural sustainability point of view. Further, though computational tools are available for the annotation and phylogenetic analysis of nifH gene sequences alone, to the best of our knowledge no tool is available for the computational prediction of the above-mentioned six categories of nitrogen-fixation (nif) genes or proteins. Thus, we proposed an approach, the first of its kind, for the computational identification of nif proteins encoded by the six categories of nif genes. Sequence-derived features were employed to map the input sequences into vectors of numeric observations that were subsequently fed to the support vector machine as input. Two types of classifier were constructed: (i) a binary classifier for classification of nif and non-nitrogen-fixation (non-nif) proteins, and (ii) a multi-class classifier for classification of the six categories of nif proteins. Higher accuracies were observed for the combination of the composition-transition-distribution (CTD) feature set and the radial kernel than for the other feature-kernel combinations. Overall accuracies of >90% were observed in both binary and multi-class classifications. The developed approach further achieved >92% accuracy when evaluated with blind (independent) test datasets. The developed approach also produced higher accuracy in identifying nif proteins when evaluated using proteome-wide datasets of several species. Furthermore, we established a prediction server, nifPred (http://webapp.cabgrid.res.in/nifPred), to assist the scientific community in proteome-wide identification of the six categories of nif proteins. In addition, the source code of nifPred is available at https://github.com/PrabinaMeher/nifPred. The developed web server is expected to supplement transcriptional profiling and comparative genomics studies for the identification and functional annotation of genes related to BNF.
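To illustrate what CTD features look like, the sketch below computes only the composition and transition descriptors for a single property. It assumes the commonly used three-class hydrophobicity grouping (polar, neutral, hydrophobic). The abstract does not say which properties or groupings nifPred uses, so the grouping, the function name and the omission of the distribution descriptors are all assumptions.

```python
# Assumed three-class hydrophobicity grouping; nifPred's own grouping may differ.
GROUPS = {
    "1": "RKEDQN",   # polar
    "2": "GASTPHY",  # neutral
    "3": "CLVIMFW",  # hydrophobic
}
RESIDUE_TO_GROUP = {res: g for g, members in GROUPS.items() for res in members}

def ctd_composition_transition(protein):
    """Return 3 composition values followed by 3 transition values."""
    # Recode the sequence as group labels, dropping non-standard residues.
    encoded = [RESIDUE_TO_GROUP[r] for r in protein.upper() if r in RESIDUE_TO_GROUP]
    n = len(encoded)
    # Composition: share of residues falling in each group.
    composition = [encoded.count(g) / n if n else 0.0 for g in "123"]
    # Transition: share of adjacent positions that switch between two given groups.
    transitions = []
    for a, b in (("1", "2"), ("1", "3"), ("2", "3")):
        changes = sum(1 for x, y in zip(encoded, encoded[1:]) if {x, y} == {a, b})
        transitions.append(changes / (n - 1) if n > 1 else 0.0)
    return composition + transitions

print(ctd_composition_transition("RGC"))
```

A full CTD encoding repeats this for several physico-chemical properties and adds the distribution descriptors, which record where along the sequence each group's residues occur.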


Frontiers in Genetics | 2018

ir-HSP: Improved Recognition of Heat Shock Proteins, Their Families and Sub-types Based On g-Spaced Di-peptide Features and Support Vector Machine

Prabina Kumar Meher; Tanmaya Kumar Sahu; Shachi Gahoi; A. R. Rao

Heat shock proteins (HSPs) play a pivotal role in cell growth and variability. Since conventional approaches are expensive and voluminous protein sequence information is available in the post-genomic era, development of an automated and accurate computational tool is highly desirable for prediction of HSPs, their families and sub-types. Thus, we propose a computational approach for reliable prediction of all these components in a single framework and with higher accuracy as well. The proposed approach achieved an overall accuracy of ~84% in predicting HSPs, ~97% in predicting six different families of HSPs, and ~94% in predicting four types of DnaJ proteins on benchmark datasets. The developed approach also achieved higher accuracy than most of the existing approaches. For easy prediction of HSPs by experimental scientists, a user-friendly web server, ir-HSP, is made freely accessible at http://cabgrid.res.in:8080/ir-hsp. ir-HSP was further evaluated for proteome-wide identification of HSPs using proteome datasets of eight different species, and ~50% of the predicted HSPs in each species were found to be annotated with InterPro HSP families/domains. Thus, the developed computational method is expected to supplement the currently available approaches for prediction of HSPs, to the extent of their families and sub-types.
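The g-spaced di-peptide features named in the title can be sketched as below. The definition is an assumption here: a pair is formed from residues at positions i and i + g + 1, so g = 0 gives ordinary adjacent di-peptides. The function name and the normalisation are likewise illustrative and not taken from the ir-HSP code.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def g_spaced_dipeptides(protein, g=1):
    """Frequencies of residue pairs separated by g intervening positions (400 values)."""
    protein = protein.upper()
    counts = Counter()
    for i in range(len(protein) - g - 1):
        first, second = protein[i], protein[i + g + 1]
        if first in AMINO_ACIDS and second in AMINO_ACIDS:
            counts[first + second] += 1
    total = sum(counts.values())
    # Counter returns 0 for pairs that never occurred.
    return [counts[a + b] / total if total else 0.0
            for a, b in product(AMINO_ACIDS, repeat=2)]

features = g_spaced_dipeptides("ACDE", g=1)  # the pairs are A..D and C..E
print(len(features))
```

Concatenating the vectors for several values of g gives a longer feature vector that captures spacing patterns a plain di-peptide count misses.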

Collaboration


Dive into Tanmaya Kumar Sahu's collaborations.

Top Co-Authors

A. R. Rao, Indian Agricultural Statistics Research Institute
Prabina Kumar Meher, Indian Agricultural Statistics Research Institute
S. D. Wahi, Indian Agricultural Statistics Research Institute
Bijay Kumar Behera, Indian Council of Agricultural Research
Nishtha Singh, Indian Agricultural Statistics Research Institute
A. P. Sharma, Indian Council of Agricultural Research
Manoswini Dash, Indian Agricultural Statistics Research Institute
Shachi Gahoi, Indian Agricultural Statistics Research Institute
T. Mohapatra, Indian Council of Agricultural Research
Anil Rai, Indian Agricultural Statistics Research Institute