Binay Panda
Strand Life Sciences
Publication
Featured research published by Binay Panda.
BMC Genomics | 2012
Neeraja M. Krishnan; Swetansu Pattnaik; Prachi Jain; Prakhar Gaur; Rakshit Choudhary; Srividya Vaidyanathan; Sa Deepak; Arun K. Hariharan; Pg Bharath Krishna; Jayalakshmi Nair; Linu Varghese; Naveen K Valivarthi; Kunal Dhas; Krishna Ramaswamy; Binay Panda
Background: The Azadirachta indica (neem) tree is a source of a wide number of natural products, including the potent biopesticide azadirachtin. In spite of its widespread applications in agriculture and medicine, the molecular aspects of the biosynthesis of neem terpenoids remain largely unexplored. The current report describes the draft genome and four transcriptomes of A. indica and attempts to contextualise the sequence information in terms of its molecular phylogeny, transcript expression and terpenoid biosynthesis pathways. A. indica is the first member of the family Meliaceae to be sequenced using next generation sequencing approach. Results: The genome and transcriptomes of A. indica were sequenced using multiple sequencing platforms and libraries. The A. indica genome is AT-rich, bears few repetitive DNA elements and comprises about 20,000 genes. The molecular phylogenetic analyses grouped A. indica together with Citrus sinensis from the Rutaceae family validating its conventional taxonomic classification. Comparative transcript expression analysis showed either exclusive or enhanced expression of known genes involved in neem terpenoid biosynthesis pathways compared to other sequenced angiosperms. Genome and transcriptome analyses in A. indica led to the identification of repeat elements, nucleotide composition and expression profiles of genes in various organs. Conclusions: This study on A. indica genome and transcriptomes will provide a model for characterization of metabolic pathways involved in synthesis of bioactive compounds, comparative evolutionary studies among various Meliaceae family members and help annotate their genomes. A better understanding of molecular pathways involved in the azadirachtin synthesis in A. indica will pave ways for bulk production of environment friendly biopesticides.
Journal of Medical Genetics | 2016
Jamie M Ellingford; Stephanie Barton; Sanjeev Bhaskar; James O'Sullivan; Simon G Williams; Janine A. Lamb; Binay Panda; Panagiotis I. Sergouniotis; Rachel L. Gillespie; Stephen P. Daiger; Georgina Hall; Theodora Gale; I. Christopher Lloyd; Paul N. Bishop; Simon C. Ramsden; Graeme C.M. Black
Background: Inherited retinal diseases (IRDs) are a clinically and genetically heterogeneous set of disorders, for which diagnostic second-generation sequencing (next-generation sequencing, NGS) services have been developed worldwide. Methods: We present the molecular findings of 537 individuals referred to a 105-gene diagnostic NGS test for IRDs. We assess the diagnostic yield, the spectrum of clinical referrals, the variant analysis burden and the genetic heterogeneity of IRD. We retrospectively analyse disease-causing variants, including an assessment of variant frequency in Exome Aggregation Consortium (ExAC). Results: Individuals were referred from 10 clinically distinct classifications of IRD. Of the 4542 variants clinically analysed, we have reported 402 mutations as a cause or a potential cause of disease in 62 of the 105 genes surveyed. These variants account or likely account for the clinical diagnosis of IRD in 51% of the 537 referred individuals. 144 potentially disease-causing mutations were identified as novel at the time of clinical analysis, and we further demonstrate the segregation of known disease-causing variants among individuals with IRD. We show that clinically analysed variants indicated as rare in dbSNP and the Exome Variant Server remain rare in ExAC, and that genes discovered as a cause of IRD in the post-NGS era are rare causes of IRD in a population of clinically surveyed individuals. Conclusions: Our findings illustrate the continued powerful utility of custom-gene panel diagnostic NGS tests for IRD in the clinic, but suggest clear future avenues for increasing diagnostic yields.
DNA Research | 2014
Meeta Sunil; Arun K. Hariharan; Soumya Nayak; Saurabh Gupta; Suran R. Nambisan; Ravi P. Gupta; Binay Panda; Bibha Choudhary; Subhashini Srinivasan
Grain amaranths, edible C4 dicots, produce pseudo-cereals high in lysine. Lysine being one of the most limiting essential amino acids in cereals and C4 photosynthesis being one of the most sought-after phenotypes in protein-rich legume crops, the genome of one of the grain amaranths is likely to play a critical role in crop research. We have sequenced the genome and transcriptome of Amaranthus hypochondriacus, a diploid (2n = 32) belonging to the order Caryophyllales with an estimated genome size of 466 Mb. Of the 411 linkage single-nucleotide polymorphisms (SNPs) reported for grain amaranths, 355 SNPs (86%) are represented in the scaffolds and 74% of the 8.6 billion bases of the sequenced transcriptome map to the genomic scaffolds. The genome of A. hypochondriacus codes for at least 24,829 proteins, shares the paleohexaploidy event with species under the superorders Rosids and Asterids, harbours 1 SNP in 1,000 bases, and contains 13.76% of repeat elements. Annotation of all the genes in the lysine biosynthetic pathway using comparative genomics and expression analysis offers insights into the high-lysine phenotype. As the first grain species under Caryophyllales and the first C4 dicot genome reported, the work presented here will be beneficial in improving crops and in expanding our understanding of angiosperm evolution.
PLOS ONE | 2012
Swetansu Pattnaik; Srividya Vaidyanathan; Durgad G. Pooja; Sa Deepak; Binay Panda
The advent of next generation sequencing (NGS) technologies has revolutionised the way biologists produce, analyse and interpret data. Although NGS platforms provide a cost-effective way to discover genome-wide variants from a single experiment, variants discovered by NGS need follow-up validation due to the high error rates associated with various sequencing chemistries. Recently, whole exome sequencing has been proposed as an affordable option compared to whole genome runs, but it still requires follow-up validation of all the novel exomic variants. Customarily, a consensus approach is used to overcome the systematic errors inherent to the sequencing technology, alignment and post-alignment variant detection algorithms. However, this approach requires multiple sequencing chemistries, multiple alignment tools and multiple variant callers, which may not be viable in terms of time and money for individual investigators with limited informatics know-how. Biologists often lack the requisite training to deal with the huge amount of data produced by NGS runs and face difficulty in choosing from the list of freely available analytical tools for NGS data analysis. Hence, there is a need to customise the NGS data analysis pipeline to preferentially retain true variants by minimising the incidence of false positives and to make the choice of the right analytical tools easier. To this end, we have sampled different freely available tools used at the alignment and post-alignment stages, suggesting the use of the most suitable combination determined by a simple framework of pre-existing metrics to create significant datasets.
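The consensus approach mentioned in this abstract can be illustrated with a minimal sketch. It assumes each caller's output has already been reduced to a set of (chromosome, position, ref, alt) tuples; the caller names, the variants and the `min_support` parameter are hypothetical and do not come from the paper.

```python
from collections import Counter

def consensus_variants(call_sets, min_support=2):
    """Return variants reported by at least `min_support` of the call sets."""
    # Count, for every variant, how many call sets contain it.
    counts = Counter(v for calls in call_sets.values() for v in set(calls))
    return {v for v, n in counts.items() if n >= min_support}

# Hypothetical call sets from three variant callers.
calls = {
    "caller_a": {("chr1", 100, "A", "G"), ("chr1", 250, "C", "T")},
    "caller_b": {("chr1", 100, "A", "G"), ("chr2", 75, "G", "A")},
    "caller_c": {("chr1", 100, "A", "G"), ("chr1", 250, "C", "T")},
}

strict = consensus_variants(calls, min_support=3)    # seen by all three callers
majority = consensus_variants(calls, min_support=2)  # seen by at least two
```

Raising `min_support` trades sensitivity for fewer false positives, which is the cost the abstract attributes to running several tools.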
BMC Bioinformatics | 2014
Swetansu Pattnaik; Saurabh Gupta; Arjun A. Rao; Binay Panda
Background: The rapid advancements in the field of genome sequencing are aiding our understanding of many biological systems. In the last five years, computational biologists and bioinformatics specialists have come up with newer, better and more efficient tools towards the discovery, analysis and interpretation of different genomic variants from high-throughput sequencing data. The availability of a reliable simulated dataset is essential and is the first step towards testing any newly developed analytical tool for variant discovery. Although there are tools currently available that can simulate variants, none can both simulate all three major types of variation (Single Nucleotide Polymorphisms, Insertions and Deletions, and Copy Number Variations) and generate reads taking a realistic error model into consideration. Therefore, an efficient simulator and read generator is needed that can simulate variants taking the error rates of true biological samples into consideration. Results: We report SInC (Snp, Indel and Cnv), an open-source variant simulator and read generator capable of simulating all three common types of biological variants, taking into account a distribution of base quality scores from a commonly used next-generation sequencing instrument from Illumina. SInC is capable of generating single- and paired-end reads with user-defined insert size and with high efficiency compared to the other existing tools. SInC, due to its multi-threaded capability during read generation, has a low time footprint. SInC is currently optimised to work in a limited infrastructure setup and can efficiently exploit the commonly used quad-core desktop architecture to simulate short sequence reads with deep coverage for large genomes. Conclusions: We have come up with a user-friendly multi-variant simulator and read-generator tool called SInC. SInC can be downloaded from http://sourceforge.net/projects/sincsimulator.
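To make the idea of a variant simulator and read generator concrete, here is a toy sketch in Python. It is not SInC's implementation: all function names are hypothetical, the copy number gain is modelled as a simple tandem duplication, and a single constant per-base error rate stands in for the base quality score distribution the abstract describes.

```python
import random

def apply_snp(seq, pos, alt):
    """Substitute a single base at `pos`."""
    return seq[:pos] + alt + seq[pos + 1:]

def apply_deletion(seq, pos, length):
    """Remove `length` bases starting at `pos`."""
    return seq[:pos] + seq[pos + length:]

def duplicate_region(seq, start, end, copies):
    """Toy CNV gain: make seq[start:end] appear `copies` times in tandem."""
    return seq[:start] + seq[start:end] * copies + seq[end:]

def simulate_reads(seq, read_len, n_reads, error_rate, rng):
    """Sample single-end reads uniformly and add substitution errors."""
    reads = []
    for _ in range(n_reads):
        start = rng.randrange(len(seq) - read_len + 1)
        read = list(seq[start:start + read_len])
        for i, base in enumerate(read):
            if rng.random() < error_rate:
                # Replace the base with a different one (assumed error model).
                read[i] = rng.choice([b for b in "ACGT" if b != base])
        reads.append("".join(read))
    return reads

reference = "ACGTACGTAC" * 5  # 50 bases, purely illustrative
# Apply variants from the highest coordinate down so earlier edits
# do not shift the positions of later ones.
mutated = duplicate_region(reference, 30, 40, 2)
mutated = apply_deletion(mutated, 20, 3)
mutated = apply_snp(mutated, 10, "T")

rng = random.Random(42)  # fixed seed for reproducible reads
reads = simulate_reads(mutated, read_len=25, n_reads=20, error_rate=0.01, rng=rng)
```

A realistic generator would draw per-position error probabilities from empirical quality scores and would also emit paired-end reads with a chosen insert size; both are omitted here for brevity.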
PeerJ | 2013
Prachi Jain; Neeraja M. Krishnan; Binay Panda
Researchers interested in studying and constructing transcriptomes, especially for non-model species, face the conundrum of choosing from a number of available de novo and genome-guided assemblers. None of the popular assembly tools in use today achieves the requisite sensitivity, specificity or recovery of full-length transcripts on its own. Here, we present a comprehensive comparative study of the performance of various assemblers. Additionally, we present an approach to combinatorially augment transcriptome assembly by using both de novo and genome-guided tools. In our study, we obtained the best recovery and most full-length transcripts with Trinity and TopHat1-Cufflinks, respectively. The sensitivity of the assembly and isoform recovery was superior, without compromising much on the specificity, when transcripts from Trinity were augmented with those from TopHat1-Cufflinks.
PLOS ONE | 2012
Neeraja M. Krishnan; Prakhar Gaur; Rakshit Chaudhary; Arjun A. Rao; Binay Panda
Copy Number Alterations (CNAs), such as deletions and duplications, compose a larger percentage of genetic variations than single nucleotide polymorphisms or other structural variations in cancer genomes that undergo major chromosomal re-arrangements. It is, therefore, imperative to identify cancer-specific somatic copy number alterations (SCNAs), with respect to matched normal tissue, in order to understand their association with the disease. We have devised an accurate, sensitive, and easy-to-use tool, COPS, COpy number using Paired Samples, for detecting SCNAs. We rigorously tested the performance of COPS using short sequence simulated reads at various sizes and coverage of SCNAs, read depths, read lengths and also with real tumor:normal paired samples. We found COPS to perform better in comparison to other known SCNA detection tools for all evaluated parameters, namely, sensitivity (detection of true positives), specificity (avoidance of false positives) and size accuracy. COPS performed well for sequencing reads of all lengths when used with most upstream read alignment tools. Additionally, by incorporating a downstream boundary segmentation detection tool, the accuracy of SCNA boundaries was further improved. Here, we report an accurate, sensitive and easy-to-use tool for detecting cancer-specific SCNAs using short-read sequence data. In addition to cancer, COPS can be used for any disease as long as sequence reads from both disease and normal samples from the same individual are available. An added boundary segmentation detection module makes COPS-detected SCNA boundaries more specific for the samples studied. COPS is available at ftp://115.119.160.213 with username “cops” and password “cops”.
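One common way to compare a tumor sample with its matched normal is a per-window log2 ratio of read depth. The sketch below shows that generic technique only; it is not claimed to be the COPS algorithm, and the window depths, the pseudocount and the gain/loss thresholds are all assumptions chosen for illustration.

```python
import math

def log2_ratios(tumor_depth, normal_depth, pseudocount=1.0):
    """Per-window log2(tumor/normal) after scaling each sample by its total depth."""
    t_total = sum(tumor_depth)
    n_total = sum(normal_depth)
    ratios = []
    for t, n in zip(tumor_depth, normal_depth):
        # The pseudocount avoids division by zero in empty windows.
        t_norm = (t + pseudocount) / t_total
        n_norm = (n + pseudocount) / n_total
        ratios.append(math.log2(t_norm / n_norm))
    return ratios

def call_windows(ratios, gain=0.4, loss=-0.4):
    """Label each window using fixed (assumed) log2 ratio thresholds."""
    return ["gain" if r >= gain else "loss" if r <= loss else "neutral"
            for r in ratios]

# Hypothetical read counts in six equal-sized genomic windows.
normal = [100, 100, 100, 100, 100, 100]
tumor = [100, 100, 150, 50, 100, 100]

ratios = log2_ratios(tumor, normal)
calls = call_windows(ratios)
```

In practice, adjacent windows with the same label would then be merged into segments, which is where a boundary segmentation step such as the one the abstract mentions becomes relevant.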
European Journal of Human Genetics | 2017
Jamie M Ellingford; Christopher Campbell; Stephanie Barton; Sanjeev Bhaskar; Saurabh Gupta; Rachel L Taylor; Panagiotis I. Sergouniotis; Bradley Horn; Janine A. Lamb; Michel Michaelides; Andrew R. Webster; William G. Newman; Binay Panda; Simon C. Ramsden; Graeme C.M. Black
Although a common cause of disease, copy number variants (CNVs) have not routinely been identified from next-generation sequencing (NGS) data in a clinical context. This study aimed to examine the sensitivity and specificity of a widely used software package, ExomeDepth, to identify CNVs from targeted NGS data sets. We benchmarked the accuracy of CNV detection using ExomeDepth v1.1.6 applied to targeted NGS data sets by comparison to CNV events detected through whole-genome sequencing for 25 individuals and determined the sensitivity and specificity of ExomeDepth applied to these targeted NGS data sets to be 100% and 99.8%, respectively. To define quality assurance metrics for CNV surveillance through ExomeDepth, we undertook simulation of single-exon (n=1000) and multiple-exon heterozygous deletion events (n=1749), determining a sensitivity of 97% (n=2749). We identified that the extent of sequencing coverage, the inter- and intra-sample variability in the depth of sequencing coverage and the composition of analysis regions are all important determinants of successful CNV surveillance through ExomeDepth. We then applied these quality assurance metrics during CNV surveillance for 140 individuals across 12 distinct clinical areas, encompassing over 500 potential rare disease diagnoses. All 140 individuals lacked molecular diagnoses after routine clinical NGS testing, and by application of ExomeDepth, we identified 17 CNVs contributing to the cause of a Mendelian disorder. Our findings support the integration of CNV detection using ExomeDepth v1.1.6 with routine targeted NGS diagnostic services for Mendelian disorders. Implementation of this strategy increases diagnostic yields and enhances clinical care.
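For readers less familiar with the metrics quoted in this abstract, the following sketch shows how sensitivity and specificity are computed from confusion-matrix counts. The counts in the example are hypothetical and are not the study's data; only the totals of simulated events (1000 single-exon and 1749 multiple-exon deletions) are taken from the abstract.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of real events that were detected."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Fraction of event-free regions correctly reported as negative."""
    return true_neg / (true_neg + false_pos)

# Totals of simulated deletion events reported in the abstract.
simulated_total = 1000 + 1749

# Hypothetical counts, chosen only to illustrate the formulas.
example_sensitivity = sensitivity(true_pos=97, false_neg=3)
example_specificity = specificity(true_neg=998, false_pos=2)
```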
F1000Research | 2015
Neeraja M. Krishnan; Saurabh Gupta; Vinayak Palve; Linu Varghese; Swetansu Pattnaik; Prachi Jain; Costerwell Khyriem; Arun K. Hariharan; Kunal Dhas; Jayalakshmi Nair; Manisha Pareek; Venkatesh K Prasad; Gangotri Siddappa; Amritha Suresh; Vikram Kekatpure; Moni Abraham Kuriakose; Binay Panda
Oral tongue squamous cell carcinomas (OTSCC) are a homogeneous group of tumors characterized by aggressive behavior, early spread to lymph nodes and a higher rate of regional failure. Additionally, the incidence of OTSCC among the younger population (<50 years) is on the rise, and many of these patients lack the typical associated risk factors of alcohol and/or tobacco exposure. We present data on single nucleotide variations (SNVs), indels, regions with loss of heterozygosity (LOH), and copy number variations (CNVs) from fifty paired oral tongue primary tumors and link the significant somatic variants with clinical parameters, epidemiological factors including human papilloma virus (HPV) infection and tumor recurrence. Apart from the frequent somatic variants harbored in TP53, CASP8, RASA1, NOTCH and CDKN2A genes, significant amplifications and/or deletions were detected in chromosomes 6-9 and 11 in the tumors. Variants in CASP8 and CDKN2A were mutually exclusive. CDKN2A, PIK3CA, RASA1 and DMD variants were exclusively linked to smoking, chewing, HPV infection and tumor stage. We also performed a whole-genome gene expression study that identified matrix metalloproteases to be highly expressed in tumors and linked pathways involving arachidonic acid and NF-κB to habits and distant metastasis, respectively. Functional knockdown studies in cell lines demonstrated the role of CASP8 in an HPV-negative OTSCC cell line. Finally, we identified a 38-gene minimal signature that predicts tumor recurrence using an ensemble machine-learning method. Taken together, this study links molecular signatures to various clinical and epidemiological factors in a homogeneous tumor population with a relatively high HPV prevalence.
Molecular Cancer Research | 2016
Neeraja M. Krishnan; Kunal Dhas; Jayalakshmi Nair; Vinayak Palve; Jamir Bagwan; Gangotri Siddappa; Amritha Suresh; Vikram Kekatpure; Moni Abraham Kuriakose; Binay Panda
Oral tongue squamous cell carcinomas (OTSCC) are a homogeneous group of aggressive tumors in the head and neck region that spread early to lymph nodes and have a higher incidence of regional failure. In addition, there is a rising incidence of oral tongue cancer in younger populations. Studies on functional DNA methylation changes linked with altered gene expression are critical for understanding the mechanisms underlying tumor development and metastasis. Such studies also provide important insight into biomarkers linked with viral infection, tumor metastasis, and patient survival in OTSCC. Therefore, we performed genome-wide methylation analysis of tumors (N = 52) and correlated altered methylation with differential gene expression. The minimal tumor-specific DNA 5-methylcytosine signature identified genes near 16 different differentially methylated regions, which were validated using genomic data from The Cancer Genome Atlas cohort. In our cohort, hypermethylation of MIR10B was significantly associated with the differential expression of its target genes NR4A3 and BCL2L11 (P = 0.0125 and P = 0.014, respectively), which was inversely correlated with disease-free survival (P = 9E−15 and P = 2E−15, respectively) in patients. Finally, differential methylation in FUT3, TRIM5, TSPAN7, MAP3K8, RPS6KA2, SLC9A9, and NPAS3 genes was found to be predictive of certain clinical and epidemiologic parameters. Implications: This study reveals a functional minimal methylation profile in oral tongue tumors with associated risk habits, clinical, and epidemiologic outcomes. In addition, NR4A3 downregulation and correlation with patient survival suggests a potential target for therapeutic intervention in oral tongue tumors. Data from the current study are deposited in the NCBI GEO database (accession number GSE75540). Mol Cancer Res; 14(9); 805–19. ©2016 AACR.