Ram Vinay Pandey
Austrian Institute of Technology
Publication
Featured research published by Ram Vinay Pandey.
Bioinformatics | 2011
Robert Kofler; Ram Vinay Pandey; Christian Schlötterer
Summary: Sequencing pooled DNA samples (Pool-Seq) is the most cost-effective approach for the genome-wide comparison of population samples. Here, we introduce PoPoolation2, the first software tool specifically designed for the comparison of populations with Pool-Seq data. PoPoolation2 implements a range of commonly used measures of differentiation (FST, Fisher's exact test and Cochran-Mantel-Haenszel test) that can be applied on different scales (windows, genes, exons, SNPs). The result may be visualized with the widely used Integrative Genomics Viewer. Availability and Implementation: PoPoolation2 is implemented in Perl and R. It is freely available on http://code.google.com/p/popoolation2/ Contact: [email protected] Supplementary Information: Manual: http://code.google.com/p/popoolation2/wiki/Manual Test data and tutorial: http://code.google.com/p/popoolation2/wiki/Tutorial Validation: http://code.google.com/p/popoolation2/wiki/Validation
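As a rough illustration of two of the differentiation measures named in this abstract, the sketch below computes a two-sided Fisher's exact test on allele counts and a simple FST for a single biallelic SNP. It is not PoPoolation2's implementation (which is written in Perl and R); the function names, the equal weighting of the two populations, and the heterozygosity-based FST definition are assumptions made for this example.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Rows are populations, columns are allele counts (e.g. reads supporting
    allele A and allele B at one SNP).
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables that are no more likely than the
    # observed one (small tolerance guards against floating-point ties).
    total = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_obs * (1 + 1e-9)
    )
    return min(1.0, total)


def fst(p1, p2):
    """FST at one biallelic SNP from allele frequencies in two populations.

    Assumes equal population weights: FST = (H_total - H_within) / H_total,
    with H = 2p(1 - p).
    """
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        return 0.0  # site is monomorphic in both populations
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total
```

For example, `fisher_exact_two_sided(3, 1, 1, 3)` compares a population with 3 A and 1 B allele counts against one with 1 A and 3 B, and `fst(1.0, 0.0)` describes two populations fixed for different alleles.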
PLOS ONE | 2011
Robert Kofler; Pablo Orozco-terWengel; Nicola De Maio; Ram Vinay Pandey; Viola Nolte; Andreas Futschik; Carolin Kosiol; Christian Schlötterer
Recent statistical analyses suggest that sequencing of pooled samples provides a cost-effective approach to determine genome-wide population genetic parameters. Here we introduce PoPoolation, a toolbox specifically designed for the population genetic analysis of sequence data from pooled individuals. PoPoolation calculates estimates of θ Watterson, θ π, and Tajima's D that account for the bias introduced by pooling and sequencing errors, as well as divergence between species. Results of genome-wide analyses can be graphically displayed in a sliding window plot. PoPoolation is written in Perl and R and it builds on commonly used data formats. Its source code can be downloaded from http://code.google.com/p/popoolation/. Furthermore, we evaluate the influence of mapping algorithms, sequencing errors, and read coverage on the accuracy of population genetic parameter estimates from pooled data.
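For orientation, the three statistics named in this abstract can be sketched in their classical form, as defined for individually sequenced chromosomes. The code below deliberately omits the corrections for pooling and sequencing error that PoPoolation applies, so it should be read as a textbook baseline rather than the tool's method; all function names are hypothetical.

```python
from math import sqrt


def harmonic_sums(n):
    """Return a1 = sum(1/i) and a2 = sum(1/i^2) for i = 1 .. n-1."""
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    return a1, a2


def theta_watterson(n, num_segregating):
    """Classical Watterson estimator: S / a1 for n sampled sequences."""
    a1, _ = harmonic_sums(n)
    return num_segregating / a1


def theta_pi(n, minor_allele_counts):
    """Mean pairwise differences, given the minor-allele count at each site."""
    num_pairs = n * (n - 1) / 2
    return sum(k * (n - k) for k in minor_allele_counts) / num_pairs


def tajimas_d(n, num_segregating, pi):
    """Classical Tajima's D; returns NaN when it is undefined (n < 3 or S = 0)."""
    if n < 3 or num_segregating == 0:
        return float("nan")
    s = num_segregating
    a1, a2 = harmonic_sums(n)
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)
    # D is the difference between the two theta estimates, scaled by its
    # estimated standard deviation.
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))
```

A negative D indicates an excess of rare variants relative to the neutral expectation, and a positive D indicates an excess of intermediate-frequency variants.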
Molecular Ecology | 2010
Ralph Medinger; Viola Nolte; Ram Vinay Pandey; Steffen Jost; Birgit Ottenwälder; Christian Schlötterer; Jens Boenigk
With the delivery of millions of sequence reads in a single experiment, next‐generation sequencing (NGS) is currently revolutionizing surveys of microorganism diversity. In particular, when applied to Eukaryotes, we are still lacking a rigorous comparison of morphological and NGS‐based diversity estimates. In this report, we studied the diversity and the seasonal community turnover of alveolates (Ciliophora and Dinophyceae) in an oligotrophic freshwater lake by SSU amplicon sequencing with NGS as well as by classical morphological analysis. We complemented the morphological analysis by single‐cell PCR followed by Sanger sequencing to provide an unambiguous link to the NGS data. We show that NGS and morphological analyses generally capture frequency shifts of abundant taxa over our seasonal samples. The observed incongruencies are probably largely due to rDNA copy number variation among taxa and heterogeneity in the efficiency of cell lysis. Overall, NGS‐based amplicon sequencing was superior in detecting rare species. We propose that in the absence of other nuclear markers less susceptible to copy number variation, rDNA‐based diversity studies need to be adjusted for confounding effects of copy number variation.
Molecular Ecology | 2010
Viola Nolte; Ram Vinay Pandey; Steffen Jost; Ralph Medinger; Birgit Ottenwälder; Jens Boenigk; Christian Schlötterer
With the advent of molecular methods, it became clear that microbial biodiversity had been vastly underestimated. Since then, species abundance patterns were determined for several environments, but temporal changes in species composition were not studied to the same level of resolution. Using massively parallel sequencing on the 454 GS FLX platform we identified a highly dynamic turnover of the seasonal abundance of protists in the Austrian lake Fuschlsee. We show that seasonal abundance patterns of protists closely match their biogeographic distribution. The stable predominance of few highly abundant taxa, which previously led to the suggestion of a low global protist species richness, is contrasted by a highly dynamic turnover of rare species. We suggest that differential seasonality of rare and abundant protist taxa explains the—so far—conflicting evidence in the ‘everything is everywhere’ dispute. Consequently temporal sampling is basic for adequate diversity and species richness estimates.
Molecular Biology and Evolution | 2012
Simon Boitard; Christian Schlötterer; Viola Nolte; Ram Vinay Pandey; Andreas Futschik
Due to its cost effectiveness, next-generation sequencing of pools of individuals (Pool-Seq) is becoming a popular strategy for characterizing variation in population samples. Because Pool-Seq provides genome-wide SNP frequency data, it is possible to use them for demographic inference and/or the identification of selective sweeps. Here, we introduce a statistical method that is designed to detect selective sweeps from pooled data by accounting for statistical challenges associated with Pool-Seq, namely sequencing errors and random sampling among chromosomes. This allows for an efficient use of the information: all base calls are included in the analysis, but the higher credibility of regions with higher coverage and base calls with better quality scores is accounted for. Computer simulations show that our method efficiently detects sweeps even at very low coverage (0.5× per chromosome). Indeed, the power of detecting sweeps is similar to what we could expect from sequences of individual chromosomes. Since the inference of selective sweeps is based on the allele frequency spectrum (AFS), we also provide a method to accurately estimate the AFS provided that the quality scores for the sequence reads are reliable. Applying our approach to Pool-Seq data from Drosophila melanogaster, we identify several selective sweep signatures on chromosome X that include some previously well-characterized sweeps like the wapl region.
Genome Research | 2013
Viola Nolte; Ram Vinay Pandey; Robert Kofler; Christian Schlötterer
Although it is well understood that selection shapes the polymorphism pattern in Drosophila, signatures of classic selective sweeps are scarce. Here, we focus on Drosophila mauritiana, an island endemic, which is closely related to Drosophila melanogaster. Based on a new, annotated genome sequence, we characterized the genome-wide polymorphism by sequencing pooled individuals (Pool-seq). We show that the interplay between selection and recombination results in a genome-wide polymorphism pattern characteristic for D. mauritiana. Two large genomic regions (>500 kb) showed the signature of almost complete selective sweeps. We propose that the absence of population structure and limited geographic distribution could explain why such pronounced sweep patterns are restricted to D. mauritiana. Further evidence for strong adaptive evolution was detected for several nucleoporin genes, some of which were not previously identified as genes involved in genomic conflict. Since this adaptive evolution is continuing after the split of D. mauritiana and Drosophila simulans, we conclude that genomic conflict is not restricted to short episodes, but rather an ongoing process in Drosophila.
BMC Research Notes | 2010
Ram Vinay Pandey; Viola Nolte; Christian Schlötterer
Background: Next generation sequencing (NGS) technologies have substantially increased the sequence output while the costs were dramatically reduced. In addition to the use in whole genome sequencing, the 454 GS-FLX platform is becoming a widely used tool for biodiversity surveys based on amplicon sequencing. In order to use NGS for biodiversity surveys, software tools are required, which perform quality control, trimming of the sequence reads, removal of PCR primers, and generation of input files for downstream analyses. A user-friendly software utility that carries out these steps is still lacking. Findings: We developed CANGS (Cleaning and Analyzing Next Generation Sequences), a flexible and user-friendly integrated software utility: CANGS is designed for amplicon based biodiversity surveys using the 454 sequencing platform. CANGS filters low quality sequences, removes PCR primers, filters singletons, identifies barcodes, and generates input files for downstream analyses. The downstream analyses rely either on third party software (e.g., rarefaction analyses) or CANGS-specific scripts. The latter include modules linking 454 sequences with the name of the closest taxonomic reference retrieved from the NCBI database and the sequence divergence between them. Our software can be easily adapted to handle sequencing projects with different amplicon sizes, primer sequences, and quality thresholds, which makes this software especially useful for non-bioinformaticians. Conclusion: CANGS performs PCR primer clipping, filtering of low quality sequences, links sequences to NCBI taxonomy and provides input files for common rarefaction analysis software programs. CANGS is written in Perl and runs on Mac OS X/Linux and is available at http://i122server.vu-wien.ac.at/pop/software.html
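The cleaning steps listed in this abstract (quality filtering, primer removal, singleton filtering) can be sketched in a few lines. This is only an illustration of the general idea, not CANGS itself, which is written in Perl: the function name, the thresholds, the use of mean quality as the filter criterion, and the choice to discard reads that lack the primer are all assumptions made here.

```python
from collections import Counter


def clean_reads(reads, forward_primer, min_mean_quality=20.0, min_length=50):
    """Filter and trim amplicon reads.

    reads: iterable of (sequence, quality_scores) pairs, where quality_scores
    is a list of per-base integer qualities.
    Returns the trimmed sequences that survive all filters.
    """
    kept = []
    for sequence, qualities in reads:
        # 1. Drop reads whose mean base quality is below the threshold.
        if not qualities or sum(qualities) / len(qualities) < min_mean_quality:
            continue
        # 2. Require the PCR primer at the read start, then clip it off
        #    (assumption: reads without the primer are discarded).
        if not sequence.startswith(forward_primer):
            continue
        trimmed = sequence[len(forward_primer):]
        # 3. Drop reads that are too short after clipping.
        if len(trimmed) < min_length:
            continue
        kept.append(trimmed)

    # 4. Remove singletons: sequences observed only once after trimming.
    counts = Counter(kept)
    return [seq for seq in kept if counts[seq] > 1]
```

The returned list could then be written to whatever input format a downstream rarefaction program expects.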
PLOS ONE | 2013
Ram Vinay Pandey; Christian Schlötterer
With the rapid and steady increase of next generation sequencing data output, the mapping of short reads has become a major data analysis bottleneck. On a single computer, it can take several days to map the vast quantity of reads produced from a single Illumina HiSeq lane. In an attempt to ameliorate this bottleneck we present a new tool, DistMap - a modular, scalable and integrated workflow to map reads in the Hadoop distributed computing framework. DistMap is easy to use, currently supports nine different short read mapping tools and can be run on all Unix-based operating systems. It accepts reads in FASTQ format as input and provides mapped reads in a SAM/BAM format. DistMap supports both paired-end and single-end reads thereby allowing the mapping of read data produced by different sequencing platforms. DistMap is available from http://code.google.com/p/distmap/
Bioinformatics | 2007
Ashwini Bhasi; Ram Vinay Pandey; Suriya Prabha Utharasamy; Periannan Senapathy
Motivation: Despite increased availability of genome annotation data, a comprehensive resource for in-depth analysis of splice signal distributions and alternative splicing (AS) patterns in eukaryote genomes is still lacking. To meet this need, we have developed EuSplice, a unique splice-centric database which provides reliable splice signal and AS information for 23 eukaryotes. Results: The EuSplice database contains 95,822 AS events and 2.1 million splice signals associated with over 270,000 protein-coding genes. The intuitive, user-friendly EuSplice web interface has powerful data mining and graphics capabilities for inter-genomic comparative analysis of splice signals, putative cryptic splice sites and AS events. Moreover, the seamless integration of splicing data to extensive gene-specific annotations, such as homolog annotations, functional information, mutations and sequence details makes EuSplice a powerful one-stop information resource for investigating the molecular mechanisms of complex splicing events, disease associations and the evolution of splicing in eukaryotes. Availability: http://66.170.16.154/EuSplice. Supplementary Information: Supplementary tables and figures at Bioinfo online.
Molecular Ecology Resources | 2013
Ram Vinay Pandey; Susanne U. Franssen; Andreas Futschik; Christian Schlötterer
Estimating differences in gene expression among alleles is of high interest for many areas in biology and medicine. Here, we present a user‐friendly software tool, Allim, to estimate allele‐specific gene expression. Because mapping bias is a major problem for reliable estimates of allele‐specific gene expression using RNA‐seq, Allim combines two different strategies to account for the mapping biases. In order to reduce the mapping bias, Allim first generates a polymorphism‐aware reference genome that accounts for the sequence variation between the alleles. Then, a sequence‐specific simulation tool estimates the residual mapping bias. Statistical tests for allelic imbalance are provided that can be used with the bias corrected RNA‐seq data.