Suneetha Uppu
Curtin University
Publication
Featured research published by Suneetha Uppu.
IEEE/ACM Transactions on Computational Biology and Bioinformatics | 2018
Suneetha Uppu; Aneesh Krishna; Raj P. Gopalan
In this era of genome-wide association studies (GWAS), the quest to understand the genetic architecture of complex diseases is intensifying. The development of high-throughput genotyping and next-generation sequencing technologies enables genetic epidemiological analysis of large-scale data. These advances have led to the identification of a number of single nucleotide polymorphisms (SNPs) responsible for disease susceptibility. The interactions between SNPs associated with complex diseases are increasingly being explored in the current literature. These interaction studies are mathematically challenging and computationally complex. These challenges have been addressed by a number of data mining and machine learning approaches. This paper reviews the current methods and the related software packages for detecting the SNP interactions that contribute to diseases. The issues that need to be considered when developing these models are addressed in this review. The paper also reviews the achievements in data simulation for evaluating the performance of these models. Further, it discusses the future of SNP interaction analysis.
International Conference on Neural Information Processing | 2016
Suneetha Uppu; Aneesh Krishna
Revealing the underlying complex architecture of human diseases has received considerable attention since the exploration of genotype-phenotype relationships in genetic epidemiology. Identification of these relationships becomes more challenging due to multiple factors acting together or independently. A deep neural network was trained in the previous work to identify two-locus interacting single nucleotide polymorphisms (SNPs) related to a complex disease. The model was assessed for all two-locus combinations under various simulated scenarios. The results showed significant improvements in predicting SNP-SNP interactions over the existing conventional machine learning techniques. Furthermore, the findings were confirmed on a published dataset. However, the performance of the proposed method on higher-order interactions was unknown. The objective of this study is to validate the model for higher-order interactions in high-dimensional data. The proposed method is further extended for unsupervised learning. A number of experiments were performed on simulated datasets under the same scenarios, as well as on a real dataset, to show the performance of the extended model. On average, the results illustrate improved performance over the previous methods. The model is further evaluated on a sporadic breast cancer dataset to identify higher-order interactions between SNPs. The results rank the top 20 higher-order SNP interactions responsible for sporadic breast cancer.
Journal of Software | 2016
Suneetha Uppu; Aneesh Krishna; Raj P. Gopalan
Susceptibility to complex diseases is characterised by numerous genetic, lifestyle, and environmental causes, acting individually or through their interaction effects. The recent explosion in detecting genetic interacting factors is increasingly revealing the underlying biological networks behind complex diseases. Several computational methods have been explored to discover interacting polymorphisms among unlinked loci. However, there has been no significant breakthrough towards solving this problem because of biomolecular complexities and computational limitations. Our previous research trained a deep multilayered feedforward neural network to predict two-locus polymorphisms due to interactions in genome-wide data. The performance of the method was studied on numerous simulated datasets and a published genome-wide dataset. In this manuscript, the performance of the trained multilayer neural network is validated by varying the parameters of the models under various scenarios. Furthermore, the observations of the previous method are confirmed in this study by evaluating on a real dataset. The experimental findings on the real dataset show a significant rise in prediction accuracy over other conventional techniques. The results show highly ranked interacting two-locus polymorphisms, which may be associated with susceptibility to the development of breast cancer.
Network Modeling Analysis in Health Informatics and BioInformatics | 2015
Suneetha Uppu; Aneesh Krishna; Raj P. Gopalan
The advancements in high-throughput human genome sequencing and computational abilities have tremendously improved the understanding of the genetic architecture behind complex diseases. The development of high-throughput genotyping and next-generation sequencing technologies enables large-scale data for genetic epidemiological analysis. These advances led to the identification of a number of single nucleotide polymorphisms (SNPs) associated with complex diseases. The interactions between SNPs responsible for disease susceptibility have been increasingly explored in the current literature. These interaction studies are mathematically challenging and computationally complex. These challenges have been addressed by a number of data mining and machine learning approaches. The goal of this research is to implement associative classification and study its effectiveness for detecting epistasis in balanced and imbalanced datasets. The proposed approach was evaluated for single-locus to six-locus models using simulated data. The datasets were generated for five different penetrance functions by varying heritability, minor allele frequency and sample size. In total, 57,300 datasets were generated and several experiments were conducted to identify the disease-causal SNP interactions. The accuracy of classification by the proposed approach was compared with the existing approaches. The experimental results demonstrated significant improvements in accuracy for detecting interactions associated with the phenotype. Further, the approach was successfully applied to sporadic breast cancer data. The results show an interaction among six polymorphisms, which included five different estrogen-metabolism genes.
International Conference on Neural Information Processing | 2015
Suneetha Uppu; Aneesh Krishna; Raj P. Gopalan
Identification and characterization of interactions between genes have been increasingly explored in current genome-wide association studies (GWAS). Several machine learning and data mining approaches have been proposed to identify multi-locus interactions in higher-order genomic data. However, detecting these interactions is challenging due to bio-molecular complexities and computational limitations. In this paper, a multifactor dimensionality reduction based associative classifier is proposed for detecting SNP interactions in genetic epidemiological studies. The approach is evaluated for one- to six-locus models by varying heritability, minor allele frequency, case-control ratios and sample size. The experimental results demonstrated significant improvements in accuracy for detecting interacting single nucleotide polymorphisms (SNPs) responsible for complex diseases when compared to the previous approaches. Further, the approach was successfully evaluated using sporadic breast cancer data. The results show interactions among five polymorphisms in three different estrogen-metabolism genes.
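For readers unfamiliar with multifactor dimensionality reduction (MDR), the pooling step it is named for can be sketched as follows. This is a minimal illustration of plain MDR only, not the associative classifier proposed in the paper; the function names, the data layout (a tuple of genotypes plus a 0/1 label per sample) and the default threshold are assumptions made for the example.

```python
from collections import Counter

def mdr_high_risk_cells(samples, loci, threshold=None):
    """Label each multi-locus genotype cell as high risk when its
    case/control ratio exceeds the threshold (plain MDR pooling step).

    samples: list of (genotypes, label) pairs; label is 1 for a case, 0 for a control.
    loci: indices of the SNPs that form the candidate interaction.
    """
    n_cases = sum(label for _, label in samples)
    n_controls = len(samples) - n_cases
    if n_cases == 0 or n_controls == 0:
        raise ValueError("need at least one case and one control")
    if threshold is None:
        # Assumed default: the overall case/control ratio of the dataset.
        threshold = n_cases / n_controls

    cases, controls = Counter(), Counter()
    for genotypes, label in samples:
        cell = tuple(genotypes[i] for i in loci)
        (cases if label else controls)[cell] += 1

    # Comparing c > threshold * k avoids dividing by zero in cells with no controls.
    return {cell for cell in set(cases) | set(controls)
            if cases[cell] > threshold * controls[cell]}

def mdr_accuracy(samples, loci, high_risk):
    """Fraction of samples whose high/low-risk cell matches their case/control label."""
    correct = sum((tuple(g[i] for i in loci) in high_risk) == bool(label)
                  for g, label in samples)
    return correct / len(samples)
```

In a full analysis every candidate combination of loci would be scored this way, usually under cross-validation, and the best-scoring combination reported.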
Bioinformatics and Bioengineering | 2014
Suneetha Uppu; Aneesh Krishna; Raj P. Gopalan
There have been many studies that depict genotype-phenotype relationships by identifying genetic variants associated with a specific disease. Researchers are paying increasing attention to interactions between SNPs that are strongly associated with disease in the absence of a main effect. In this context, a number of machine learning and data mining tools have been applied to identify combinations of multi-locus SNPs in higher-order data. However, none of the current models can identify useful SNP-SNP interactions for high-dimensional genome data. Detecting these interactions is challenging due to bio-molecular complexities and computational limitations. The goal of this research was to implement associative classification and study its effectiveness for detecting epistasis in balanced and imbalanced datasets. The proposed approach was evaluated for two-locus epistatic interactions using simulated data. The datasets were generated for five different penetrance functions by varying heritability, minor allele frequency and sample size. In total, 23,400 datasets were generated and several experiments were conducted to identify the disease-causal SNP interactions. The accuracy of classification by the proposed approach was compared with the previous approaches. Though associative classification showed only a relatively small improvement in accuracy for balanced datasets, it outperformed existing approaches for higher-order multi-locus interactions in imbalanced datasets.
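How a case-control dataset can be generated from a penetrance function may be easier to see in code. The sketch below is a rough illustration under stated assumptions, not the simulation procedure used in the paper: genotypes are drawn under Hardy-Weinberg equilibrium with the same minor allele frequency at both loci, and `EXAMPLE_PENETRANCE` is a textbook-style table chosen so that neither SNP has a main effect at a minor allele frequency of 0.5. All names are hypothetical.

```python
import random

# Illustrative two-locus penetrance table (rows: SNP A genotype 0/1/2,
# columns: SNP B genotype 0/1/2). An assumption for this example only.
EXAMPLE_PENETRANCE = [
    [0.0, 0.1, 0.0],
    [0.1, 0.0, 0.1],
    [0.0, 0.1, 0.0],
]

def genotype_freqs(maf):
    """Hardy-Weinberg frequencies for genotypes coded by minor-allele count 0, 1, 2."""
    p = 1.0 - maf
    return [p * p, 2 * p * maf, maf * maf]

def marginal_penetrance(penetrance, maf):
    """Disease probability for each SNP A genotype, averaged over SNP B."""
    freqs = genotype_freqs(maf)
    return [sum(freqs[b] * penetrance[a][b] for b in range(3)) for a in range(3)]

def simulate(penetrance, maf, n_cases, n_controls, seed=0):
    """Draw individuals until the requested numbers of cases and controls are reached."""
    rng = random.Random(seed)
    freqs = genotype_freqs(maf)
    cases, controls = [], []
    while len(cases) < n_cases or len(controls) < n_controls:
        a = rng.choices([0, 1, 2], weights=freqs)[0]
        b = rng.choices([0, 1, 2], weights=freqs)[0]
        affected = rng.random() < penetrance[a][b]
        if affected and len(cases) < n_cases:
            cases.append((a, b, 1))
        elif not affected and len(controls) < n_controls:
            controls.append((a, b, 0))
    return cases + controls
```

Choosing different values for `n_cases` and `n_controls` gives an imbalanced dataset, and changing `maf` or the table entries varies the other simulation parameters mentioned in the abstract.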
International Conference on Neural Information Processing | 2017
Suneetha Uppu; Aneesh Krishna
In genetic epidemiology, epistasis has been studied by many researchers seeking to understand the underlying causes of complex diseases. Identifying gene-gene and/or gene-environment interactions is becoming more challenging due to multiple genetic and environmental factors acting together or independently. The limitations of current computational approaches motivated the development of a deep learning method in our recent study. The approach trained a multilayered feedforward neural network to discover interacting genes associated with complex diseases. The models were evaluated under various simulated scenarios and compared with the previous methods. The results showed significant improvements in predicting gene interactions over traditional machine learning techniques. This study further extends that work to maximize the predictive performance of the method by tuning the hyperparameters using Cartesian grid and random grid searching. Several experiments are conducted on real datasets to identify higher-order interacting genes responsible for diseases. The findings demonstrated that randomly chosen trials are more efficient than trials chosen by grid search for optimizing hyperparameters. The optimal configuration of hyperparameter values improved the model performance without overfitting. The results identify the top 30 gene interactions responsible for sporadic breast cancer and hypertension.
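The difference between the two search strategies named above can be sketched in a few lines. The search space, the hyperparameter names and the scoring function below are hypothetical stand-ins; in the study, scoring a configuration would mean training and validating a neural network rather than evaluating `toy_score`.

```python
import itertools
import math
import random

# Hypothetical search space; the values are assumptions for illustration.
SPACE = {
    "hidden_units": [16, 32, 64, 128],
    "learning_rate": [0.001, 0.01, 0.1],
    "dropout": [0.0, 0.2, 0.5],
}

def cartesian_grid(space):
    """Every combination of hyperparameter values (exhaustive grid)."""
    keys = list(space)
    return [dict(zip(keys, combo))
            for combo in itertools.product(*(space[k] for k in keys))]

def random_grid(space, n_trials, seed=0):
    """A random subset of the grid, sampled without replacement."""
    rng = random.Random(seed)
    grid = cartesian_grid(space)
    return rng.sample(grid, min(n_trials, len(grid)))

def toy_score(cfg):
    """Stand-in for validation accuracy; peaks at 64 units, lr 0.01, dropout 0.2."""
    return -(abs(cfg["hidden_units"] - 64) / 64
             + abs(math.log10(cfg["learning_rate"]) + 2)
             + abs(cfg["dropout"] - 0.2))

def search(candidates, score):
    """Return the candidate configuration with the highest score."""
    return max(candidates, key=score)
```

Calling `search(cartesian_grid(SPACE), toy_score)` evaluates all 36 configurations, whereas `search(random_grid(SPACE, 10), toy_score)` evaluates only 10, which is where the cost saving of random search comes from when each evaluation is a full training run.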
Network Modeling Analysis in Health Informatics and BioInformatics | 2016
Suneetha Uppu; Aneesh Krishna
The advancements in genetic epidemiology have focused more on understanding the associations and functional relationships among the genes. Identifying the susceptible genes and their interaction effects over the complex traits remains statistically and computationally challenging. An associative classification-based multifactor dimensionality reduction method (MDRAC) was proposed to improve the identification of multi-locus interacting genes associated with a disease. The method was evaluated for one to six loci by varying heritability, minor allele frequency, case–control ratios, and sample size. The experimental results demonstrated significant improvements in the accuracy over the previous methods. However, the performance of MDRAC in the presence of noise due to genotyping error, missing data, phenocopy, and genetic heterogeneity is unknown. The goal of this study is to evaluate MDRAC for identifying single nucleotide polymorphism interactions in the presence of noise. Several experiments are conducted on simulated datasets and on a published dataset to demonstrate the performance of MDRAC. On average, the results showed improved performance over the previous MDR method in all the models. However, the performance of MDRAC is reduced in the presence of phenocopy and genetic heterogeneity, or their combinations with other sources of noise.
International Journal of Medical Informatics | 2018
Suneetha Uppu; Aneesh Krishna
Identifying genetic variants associated with complex diseases is a central focus of genome-wide association studies. These studies extensively adopt univariate analysis, ignoring interaction effects. It is widely accepted that the etiology of most complex diseases depends on interactions between genetic variants and/or environmental factors. Several machine learning and data mining methods have been consistently successful in exposing these interaction effects. However, despite many efforts, there has been no major breakthrough, owing to the biological complexities and the statistical and computational challenges facing the field of genetic epidemiology. Deep learning is an emerging machine learning approach that promises to reveal the hidden patterns of big data for accurate predictions. In this study, a deep neural network is unified with a random forest to form a hybrid architecture for achieving reliable detection of multi-locus interactions between single nucleotide polymorphisms. The proposed hybrid method is evaluated on various simulated scenarios in the absence of main effect for six epistasis models. The best model with optimal hyper-parameters (found by grid and random grid search) is chosen to enhance the power of the method by maximising the model's prediction accuracy. The performance metrics of each model are analysed for both training and validation. Further, the performance of the method is evaluated in the presence of noise due to missing data, genotyping errors, genetic heterogeneity, and phenocopy, and their combined effects. The power of the method in detecting two-locus interactions is compared with the previous methods in the presence and absence of noise. On average, the power of the proposed method is much higher than that of the previous methods for all simulated scenarios. Finally, the findings are confirmed on data from chronic dialysis patients, obtained from a published study performed at the Kaohsiung Chang Gung Memorial Hospital. It is observed that the interaction between SNP 21 (2) and SNP 28 (2) in the mitochondrial D-loop has the highest risk for the disease manifestation.
International Conference on Intelligent Information Processing | 2014
Suneetha Uppu; Aneesh Krishna; Raj P. Gopalan