Publication


Featured research published by Dukka B. Kc.


BioMed Research International | 2016

RF-Phos: A Novel General Phosphorylation Site Prediction Tool Based on Random Forest

H. Ismail; Ahoi Jones; Jung H. Kim; Robert H. Newman; Dukka B. Kc

Protein phosphorylation is one of the most widespread regulatory mechanisms in eukaryotes. Over the past decade, phosphorylation site prediction has emerged as an important problem in the field of bioinformatics. Here, we report a new method, termed Random Forest-based Phosphosite predictor 2.0 (RF-Phos 2.0), to predict phosphorylation sites given only the primary amino acid sequence of a protein as input. RF-Phos 2.0, which uses random forest with sequence and structural features, is able to identify putative sites of phosphorylation across many protein families. In side-by-side comparisons based on 10-fold cross validation and an independent dataset, RF-Phos 2.0 compares favorably to other popular mammalian phosphosite prediction methods, such as PhosphoSVM, GPS2.1, and Musite.


GigaScience | 2015

The PFP and ESG protein function prediction methods in 2014: effect of database updates and ensemble approaches

Ishita K. Khan; Qing Wei; Samuel Chapman; Dukka B. Kc; Daisuke Kihara

Background: Functional annotation of novel proteins is one of the central problems in bioinformatics. With the ever-increasing development of genome sequencing technologies, more and more sequence information is becoming available to analyze and annotate. To achieve fast and automatic function annotation, many computational (automated) function prediction (AFP) methods have been developed. To objectively evaluate the performance of such methods on a large scale, community-wide assessment experiments have been conducted. The second round of the Critical Assessment of Function Annotation (CAFA) experiment was held in 2013–2014. Evaluation of participating groups was reported in a special interest group meeting at the Intelligent Systems in Molecular Biology (ISMB) conference in Boston in 2014. Our group participated in both CAFA1 and CAFA2 using multiple, in-house AFP methods. Here, we report benchmark results of our methods obtained in the course of preparation for CAFA2 prior to submitting function predictions for CAFA2 targets.

Results: For CAFA2, we updated the annotation databases used by our methods, protein function prediction (PFP) and extended similarity group (ESG), and benchmarked their function prediction performances using the original (older) and updated databases. Performance evaluation for PFP with different settings and ESG are discussed. We also developed two ensemble methods that combine function predictions from six independent, sequence-based AFP methods. We further analyzed the performances of our prediction methods by enriching the predictions with prior distribution of gene ontology (GO) terms. Examples of predictions by the ensemble methods are discussed.

Conclusions: Updating the annotation database was successful, improving the Fmax prediction accuracy score for both PFP and ESG. Adding the prior distribution of GO terms did not make much improvement. Both of the ensemble methods we developed improved the average Fmax score over all individual component methods except for ESG. Our benchmark results will not only complement the overall assessment that will be done by the CAFA organizers, but also help elucidate the predictive powers of sequence-based function prediction methods in general.
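
For readers unfamiliar with the Fmax score mentioned in this abstract, the following is a minimal sketch of how a protein-centric Fmax is commonly computed: the maximum, over score thresholds, of the harmonic mean of average precision and average recall. The function name `fmax`, the dictionary layout, and the 0.01 threshold grid are illustrative assumptions, not the authors' benchmarking code.

```python
def fmax(predictions, truths, thresholds=None):
    """Protein-centric Fmax (illustrative sketch, not the authors' code).

    predictions: {protein: {go_term: score in [0, 1]}}  (hypothetical layout)
    truths:      {protein: set of true go_terms}
    """
    if thresholds is None:
        # Assumed threshold grid: 0.01, 0.02, ..., 1.00
        thresholds = [t / 100 for t in range(1, 101)]
    best = 0.0
    for t in thresholds:
        precisions, recalls = [], []
        for protein, true_terms in truths.items():
            scored = predictions.get(protein, {})
            predicted = {term for term, s in scored.items() if s >= t}
            hits = len(predicted & true_terms)
            # Precision is averaged only over proteins with a prediction at t.
            if predicted:
                precisions.append(hits / len(predicted))
            # Recall is averaged over all benchmark proteins.
            recalls.append(hits / len(true_terms) if true_terms else 0.0)
        if not precisions:
            continue
        pr = sum(precisions) / len(precisions)
        rc = sum(recalls) / len(recalls)
        if pr + rc > 0:
            best = max(best, 2 * pr * rc / (pr + rc))
    return best


if __name__ == "__main__":
    preds = {"P1": {"GO:1": 0.9, "GO:2": 0.4}}
    truth = {"P1": {"GO:1"}}
    print(fmax(preds, truth))
```

In the toy example, thresholds above 0.4 keep only the correct term, so precision and recall both reach 1 at those thresholds.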


IEEE/ACM Transactions on Computational Biology and Bioinformatics | 2011

Topology Improves Phylogenetic Motif Functional Site Predictions

Dukka B. Kc; Dennis R. Livesay

Prediction of protein functional sites from sequence-derived data remains an open bioinformatics problem. We have developed a phylogenetic motif (PM) functional site prediction approach that identifies functional sites from alignment fragments that parallel the evolutionary patterns of the family. In our approach, PMs are identified by comparing tree topologies of each alignment fragment to that of the complete phylogeny. Herein, we bypass the phylogenetic reconstruction step and identify PMs directly from distance matrix comparisons. In order to optimize the new algorithm, we consider three different distance matrices and 13 different matrix similarity scores. We assess the performance of the various approaches on a structurally nonredundant data set that includes three types of functional site definitions. Without exception, the predictive power of the original approach outperforms the distance matrix variants. While the distance matrix methods fail to improve upon the original approach, our results are important because they clearly demonstrate that the improved predictive power is based on the topological comparisons, meaning that phylogenetic trees are a straightforward, yet powerful way to improve functional site prediction accuracy. While complementary studies have shown that topology improves predictions of protein-protein interactions, this report represents the first demonstration that trees improve functional site predictions as well.
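
The distance-matrix variant described in this abstract can be illustrated with a small sketch: slide a window along the alignment, build a pairwise distance matrix for each fragment, and score its similarity to the matrix of the full alignment. The p-distance and Pearson correlation used here are assumptions chosen for simplicity; the abstract does not name the three distance matrices or the 13 similarity scores it evaluated, and all function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def p_distances(seqs):
    """Upper-triangle p-distances (fraction of differing columns)."""
    return [
        sum(a != b for a, b in zip(s, t)) / len(s)
        for s, t in combinations(seqs, 2)
    ]


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def window_scores(alignment, width):
    """Score each alignment window by how well its distances track the whole family."""
    full = p_distances(alignment)
    scores = []
    for start in range(len(alignment[0]) - width + 1):
        fragment = [seq[start:start + width] for seq in alignment]
        scores.append((start, pearson(p_distances(fragment), full)))
    return scores


if __name__ == "__main__":
    toy_alignment = ["AAAA", "AAAT", "TTTT"]  # made-up example data
    for start, score in window_scores(toy_alignment, width=2):
        print(start, round(score, 3))
```

Windows whose distances correlate strongly with the full-alignment distances would be the candidate motifs in this simplified setting. Note that the abstract reports the original tree-topology comparison outperforming distance-matrix variants of this kind.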


Genetic and Evolutionary Computation Conference | 2014

Time-series forecasting with evolvable partially connected artificial neural network

Mohammad Gorji Sefidmazgi; Abdollah Homaifar; Dukka B. Kc; Anthony Guiseppi-Elie

In nonlinear and chaotic time series prediction, constructing the mathematical model of the system dynamics is not an easy task. Partially connected Artificial Neural Network with Evolvable Topology (PANNET) is a new paradigm for prediction of chaotic time series without access to the dynamics and essential memory depth of the system. The evolvable topology of PANNET provides flexibility in recognizing systems, in contrast to the fixed layered topology of traditional ANNs. This evolvable topology guides the relationship between observation nodes and hidden nodes, where hidden nodes are extra nodes that play the role of memory or internal states of the system. In the proposed variable-length Genetic Algorithm (GA), internal neurons can be connected arbitrarily to any type of node. In addition, the number of neurons, the inputs and outputs of each neuron, and the origin and weight of each connection evolve in order to find the best configuration of the network.


Briefings in Bioinformatics | 2016

Recent advances in sequence-based protein structure prediction

Dukka B. Kc

The most accurate characterizations of the structure of proteins are provided by structural biology experiments. However, because of the high cost and labor-intensive nature of the structural experiments, the gap between the number of protein sequences and solved structures is widening rapidly. Development of computational methods to accurately model protein structures from sequences is becoming increasingly important to the biological community. In this article, we highlight some important progress in the field of protein structure prediction, especially those related to free modeling (FM) methods that generate structure models without using homologous templates. We also provide a short synopsis of some of the recent advances in FM approaches as demonstrated in the recent Computational Assessment of Structure Prediction competition as well as recent trends and outlook for FM approaches in protein structure prediction.


PeerJ | 2017

The evolution of logic circuits for the purpose of protein contact map prediction

Samuel Chapman; Christoph Adami; Claus O. Wilke; Dukka B. Kc

Predicting protein structure from sequence remains a major open problem in protein biochemistry. One component of predicting complete structures is the prediction of inter-residue contact patterns (contact maps). Here, we discuss protein contact map prediction by machine learning. We describe a novel method for contact map prediction that uses the evolution of logic circuits. These logic circuits operate on feature data and output whether or not two amino acids in a protein are in contact. We show that such a method is feasible, and in addition that evolution allows the logic circuits to be trained on the dataset in an unbiased manner so that they can be used in both contact map prediction and the selection of relevant features in a dataset.
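
As background on the prediction target named in this abstract, a contact map is a square boolean matrix marking which residue pairs lie close together in the folded structure. The sketch below derives one from known coordinates; the 8 Å cutoff and the `min_separation` parameter are common conventions assumed here for illustration, not values taken from the paper.

```python
from math import dist


def contact_map(coords, cutoff=8.0, min_separation=1):
    """Boolean contact map from one 3D coordinate per residue.

    cutoff (in angstroms) and min_separation (in sequence positions)
    are assumed illustrative defaults.
    """
    n = len(coords)
    return [
        [
            abs(i - j) >= min_separation and dist(coords[i], coords[j]) <= cutoff
            for j in range(n)
        ]
        for i in range(n)
    ]


if __name__ == "__main__":
    # Made-up coordinates for three residues along the x axis.
    toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (20.0, 0.0, 0.0)]
    for row in contact_map(toy_coords):
        print(row)
```

A predictor such as the one described above would try to recover this matrix from sequence-derived features alone, without access to the coordinates.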


BMC Bioinformatics | 2017

CNN-BLPred: A Convolutional neural network based predictor for β-Lactamases (BL) and their classes

Clarence White; H. Ismail; Hiroto Saigo; Dukka B. Kc

Background: The β-Lactamase (BL) enzyme family is an important class of enzymes that plays a key role in bacterial resistance to antibiotics. As the newly identified number of BL enzymes is increasing daily, it is imperative to develop a computational tool to classify the newly identified BL enzymes into one of their classes. There are two types of classification of BL enzymes: Molecular Classification and Functional Classification. Existing computational methods only address Molecular Classification and the performance of these existing methods is unsatisfactory.

Results: We addressed the unsatisfactory performance of the existing methods by implementing a Deep Learning approach called Convolutional Neural Network (CNN). We developed CNN-BLPred, an approach for the classification of BL proteins. The CNN-BLPred uses Gradient Boosted Feature Selection (GBFS) in order to select the ideal feature set for each BL classification. Based on the rigorous benchmarking of CNN-BLPred using both leave-one-out cross-validation and independent test sets, CNN-BLPred performed better than the other existing algorithms. Compared with other architectures of CNN, Recurrent Neural Network, and Random Forest, the simple CNN architecture with only one convolutional layer performs the best. After feature extraction, we were able to remove ~95% of the 10,912 features using Gradient Boosted Trees. During 10-fold cross validation, we increased the accuracy of the classic BL predictions by 7%. We also increased the accuracy of Class A, Class B, Class C, and Class D performance by an average of 25.64%. The independent test results followed a similar trend.

Conclusions: We implemented a deep learning algorithm known as Convolutional Neural Network (CNN) to develop a classifier for BL classification. Combined with feature selection on an exhaustive feature set and using balancing methods such as Random Oversampling (ROS), Random Undersampling (RUS) and Synthetic Minority Oversampling Technique (SMOTE), CNN-BLPred performs significantly better than existing algorithms for BL classification.
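
Of the balancing methods listed in this abstract, Random Oversampling is the simplest to illustrate: minority-class samples are duplicated at random until every class matches the largest one. The sketch below is a generic illustration under that definition; the function name, the fixed seed, and the toy labels are assumptions and do not reflect the CNN-BLPred implementation.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority samples until all classes are equal in size."""
    rng = random.Random(seed)  # fixed seed (assumed) for reproducibility
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(items) for items in by_class.values())
    out_x, out_y = [], []
    for y, items in by_class.items():
        # Draw with replacement from the class's own samples.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for x in items + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


if __name__ == "__main__":
    # Made-up feature vectors with hypothetical class labels.
    xs = [[0.1], [0.2], [0.3], [0.9]]
    ys = ["A", "A", "A", "B"]
    new_x, new_y = random_oversample(xs, ys)
    print(Counter(new_y))
```

Random Undersampling works the other way, discarding majority-class samples, while SMOTE synthesizes new minority samples by interpolating between neighbors rather than duplicating them.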


Archive | 2011

Predicting Protein Functional Sites with Phylogenetic Motifs: Past, Present and Beyond

Dennis R. Livesay; Dukka B. Kc; David La

More than sequence or structure, function imposes very tight constraints on the evolutionary variability within a protein family. As such, numerous functional site prediction methods are based on algorithms to uncover conserved regions that lead to conserved function. Nevertheless, evolution does allow for some systematic variability within functional regions. Based on this tenet, we have introduced the MINER algorithm to predict functional regions from phylogenetic motifs. Specifically, our approach identifies alignment fragments that parallel the overall phylogeny of the family, which are more likely to be functional due to increased evolutionary signature. In this chapter, we provide an overview of the method, summarize recent developments, and comment on future work.


BioMed Research International | 2016

Parallel-SymD: A Parallel Approach to Detect Internal Symmetry in Protein Domains.

Ashwani Jha; K. M. Flurchick; Marwan Bikdash; Dukka B. Kc

Internally symmetric proteins are proteins that have a symmetrical structure in their monomeric single-chain form. Around 10–15% of the protein domains can be regarded as having some sort of internal symmetry. In this regard, we previously published SymD (symmetry detection), an algorithm that determines whether a given protein structure has internal symmetry by attempting to align the protein to its own copy after the copy is circularly permuted by all possible numbers of residues. SymD has proven to be a useful algorithm to detect symmetry. In this paper, we present a new parallelized algorithm called Parallel-SymD for detecting symmetry of proteins on clusters of computers. The achieved speedup of the new Parallel-SymD algorithm scales well with the number of computing processors. Scaling is better for proteins with a larger number of residues. For a protein of 509 residues, a speedup of 63 was achieved on a parallel system with 100 processors.
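
Two ideas from this abstract can be made concrete with a short sketch: enumerating the circular permutations that a SymD-style search would try, and converting the reported speedup into parallel efficiency. The sketch shifts a plain residue string for illustration only; the actual method aligns 3D structures, and both function names are hypothetical.

```python
def circular_permutations(residues):
    """Yield (shift, permuted sequence) for every nontrivial circular shift."""
    n = len(residues)
    for shift in range(1, n):
        yield shift, residues[shift:] + residues[:shift]


def parallel_efficiency(speedup, processors):
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / processors


if __name__ == "__main__":
    # Each shift is an independent alignment task, which is why the
    # search can be distributed across processors.
    for shift, permuted in circular_permutations("ABCD"):
        print(shift, permuted)
    # Figures reported in the abstract: speedup of 63 on 100 processors.
    print(parallel_efficiency(63, 100))
```

With the figures quoted in the abstract, a speedup of 63 on 100 processors corresponds to a parallel efficiency of 0.63.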


Molecular BioSystems | 2016

RF-Hydroxysite: a random forest based predictor for hydroxylation sites

H. Ismail; Robert H. Newman; Dukka B. Kc

Collaboration


Dive into Dukka B. Kc's collaborations.

Top Co-Authors

H. Ismail
North Carolina Agricultural and Technical State University

Robert H. Newman
North Carolina Agricultural and Technical State University

Dennis R. Livesay
University of North Carolina at Charlotte

Samuel Chapman
Michigan State University

Abdollah Homaifar
North Carolina Agricultural and Technical State University

Ahoi Jones
North Carolina Agricultural and Technical State University

Ashwani Jha
North Carolina Agricultural and Technical State University

Christoph Adami
Michigan State University

Claus O. Wilke
University of Texas at Austin