Network


Latest external collaboration on country level. Dive into details by clicking on the dots.

Hotspot


Dive into the research topics where Andreas Futschik is active.

Publication


Featured researches published by Andreas Futschik.


PLOS ONE | 2011

PoPoolation: A Toolbox for Population Genetic Analysis of Next Generation Sequencing Data from Pooled Individuals

Robert Kofler; Pablo Orozco-terWengel; Nicola De Maio; Ram Vinay Pandey; Viola Nolte; Andreas Futschik; Carolin Kosiol; Christian Schlötterer

Recent statistical analyses suggest that sequencing of pooled samples provides a cost effective approach to determine genome-wide population genetic parameters. Here we introduce PoPoolation, a toolbox specifically designed for the population genetic analysis of sequence data from pooled individuals. PoPoolation calculates estimates of θ Watterson, θ π, and Tajimas D that account for the bias introduced by pooling and sequencing errors, as well as divergence between species. Results of genome-wide analyses can be graphically displayed in a sliding window plot. PoPoolation is written in Perl and R and it builds on commonly used data formats. Its source code can be downloaded from http://code.google.com/p/popoolation/. Furthermore, we evaluate the influence of mapping algorithms, sequencing errors, and read coverage on the accuracy of population genetic parameter estimates from pooled data.


Genetics | 2010

The Next Generation of Molecular Markers From Massively Parallel Sequencing of Pooled DNA Samples

Andreas Futschik; Christian Schlötterer

Next generation sequencing (NGS) is about to revolutionize genetic analysis. Currently NGS techniques are mainly used to sequence individual genomes. Due to the high sequence coverage required, the costs for population-scale analyses are still too high to allow an extension to nonmodel organisms. Here, we show that NGS of pools of individuals is often more effective in SNP discovery and provides more accurate allele frequency estimates, even when taking sequencing errors into account. We modify the population genetic estimators Tajimas π and Wattersons θ to obtain unbiased estimates from NGS pooling data. Given the same sequencing effort, the resulting estimators often show a better performance than those obtained from individual sequencing. Although our analysis also shows that NGS of pools of individuals will not be preferable under all circumstances, it provides a cost-effective approach to estimate allele frequencies on a genome-wide scale.


PLOS Genetics | 2013

A Genome-Wide, Fine-Scale Map of Natural Pigmentation Variation in Drosophila melanogaster

Héloı̈se Bastide; Andrea J. Betancourt; Viola Nolte; Raymond Tobler; Petra Stöbe; Andreas Futschik; Christian Schlötterer

Various approaches can be applied to uncover the genetic basis of natural phenotypic variation, each with their specific strengths and limitations. Here, we use a replicated genome-wide association approach (Pool-GWAS) to fine-scale map genomic regions contributing to natural variation in female abdominal pigmentation in Drosophila melanogaster, a trait that is highly variable in natural populations and highly heritable in the laboratory. We examined abdominal pigmentation phenotypes in approximately 8000 female European D. melanogaster, isolating 1000 individuals with extreme phenotypes. We then used whole-genome Illumina sequencing to identify single nucleotide polymorphisms (SNPs) segregating in our sample, and tested these for associations with pigmentation by contrasting allele frequencies between replicate pools of light and dark individuals. We identify two small regions near the pigmentation genes tan and bric-à-brac 1, both corresponding to known cis-regulatory regions, which contain SNPs showing significant associations with pigmentation variation. While the Pool-GWAS approach suffers some limitations, its cost advantage facilitates replication and it can be applied to any non-model system with an available reference genome.


Genetics | 2012

A novel approach for choosing summary statistics in approximate Bayesian computation.

Simon Aeschbacher; Mark A. Beaumont; Andreas Futschik

The choice of summary statistics is a crucial step in approximate Bayesian computation (ABC). Since statistics are often not sufficient, this choice involves a trade-off between loss of information and reduction of dimensionality. The latter may increase the efficiency of ABC. Here, we propose an approach for choosing summary statistics based on boosting, a technique from the machine-learning literature. We consider different types of boosting and compare them to partial least-squares regression as an alternative. To mitigate the lack of sufficiency, we also propose an approach for choosing summary statistics locally, in the putative neighborhood of the true parameter value. We study a demographic model motivated by the reintroduction of Alpine ibex (Capra ibex) into the Swiss Alps. The parameters of interest are the mean and standard deviation across microsatellites of the scaled ancestral mutation rate (θanc = 4Neu) and the proportion of males obtaining access to matings per breeding season (ω). By simulation, we assess the properties of the posterior distribution obtained with the various methods. According to our criteria, ABC with summary statistics chosen locally via boosting with the L2-loss performs best. Applying that method to the ibex data, we estimate θ^anc≈1.288 and find that most of the variation across loci of the ancestral mutation rate u is between 7.7 × 10−4 and 3.5 × 10−3 per locus per generation. The proportion of males with access to matings is estimated as ω^≈0.21, which is in good agreement with recent independent estimates.


Molecular Biology and Evolution | 2012

Detecting Selective Sweeps from Pooled Next-Generation Sequencing Samples

Simon Boitard; Christian Schlötterer; Viola Nolte; Ram Vinay Pandey; Andreas Futschik

Due to its cost effectiveness, next-generation sequencing of pools of individuals (Pool-Seq) is becoming a popular strategy for characterizing variation in population samples. Because Pool-Seq provides genome-wide SNP frequency data, it is possible to use them for demographic inference and/or the identification of selective sweeps. Here, we introduce a statistical method that is designed to detect selective sweeps from pooled data by accounting for statistical challenges associated with Pool-Seq, namely sequencing errors and random sampling among chromosomes. This allows for an efficient use of the information: all base calls are included in the analysis, but the higher credibility of regions with higher coverage and base calls with better quality scores is accounted for. Computer simulations show that our method efficiently detects sweeps even at very low coverage (0.5× per chromosome). Indeed, the power of detecting sweeps is similar to what we could expect from sequences of individual chromosomes. Since the inference of selective sweeps is based on the allele frequency spectrum (AFS), we also provide a method to accurately estimate the AFS provided that the quality scores for the sequence reads are reliable. Applying our approach to Pool-Seq data from Drosophila melanogaster, we identify several selective sweep signatures on chromosome X that include some previously well-characterized sweeps like the wapl region.


Genetics | 2009

Detecting Selective Sweeps: A New Approach Based on Hidden Markov Models

Sim On Boitard; Christian Schlötterer; Andreas Futschik

Detecting and localizing selective sweeps on the basis of SNP data has recently received considerable attention. Here we introduce the use of hidden Markov models (HMMs) for the detection of selective sweeps in DNA sequences. Like previously published methods, our HMMs use the site frequency spectrum, and the spatial pattern of diversity along the sequence, to identify selection. In contrast to earlier approaches, our HMMs explicitly model the correlation structure between linked sites. The detection power of our methods, and their accuracy for estimating the selected site location, is similar to that of competing methods for constant size populations. In the case of population bottlenecks, however, our methods frequently showed fewer false positives.


Genetics | 2011

Distinguishing Positive Selection From Neutral Evolution: Boosting the Performance of Summary Statistics

Kao Lin; Haipeng Li; Christian Schlötterer; Andreas Futschik

Summary statistics are widely used in population genetics, but they suffer from the drawback that no simple sufficient summary statistic exists, which captures all information required to distinguish different evolutionary hypotheses. Here, we apply boosting, a recent statistical method that combines simple classification rules to maximize their joint predictive performance. We show that our implementation of boosting has a high power to detect selective sweeps. Demographic events, such as bottlenecks, do not result in a large excess of false positives. A comparison to other neutrality tests shows that our boosting implementation performs well compared to other neutrality tests. Furthermore, we evaluated the relative contribution of different summary statistics to the identification of selection and found that for recent sweeps integrated haplotype homozygosity is very informative whereas older sweeps are better detected by Tajimas π. Overall, Wattersons θ was found to contribute the most information for distinguishing between bottlenecks and selection.


Molecular Ecology Resources | 2013

Pool-hmm: a Python program for estimating the allele frequency spectrum and detecting selective sweeps from next generation sequencing of pooled samples

Simon Boitard; Robert Kofler; Pierre Françoise; David Robelin; Christian Schlötterer; Andreas Futschik

Due to its cost effectiveness, next generation sequencing of pools of individuals (Pool‐Seq) is becoming a popular strategy for genome‐wide estimation of allele frequencies in population samples. As the allele frequency spectrum provides information about past episodes of selection, Pool‐seq is also a promising design for genomic scans for selection. However, no software tool has yet been developed for selection scans based on Pool‐Seq data. We introduce Pool‐hmm, a Python program for the estimation of allele frequencies and the detection of selective sweeps in a Pool‐Seq sample. Pool‐hmm includes several options that allow a flexible analysis of Pool‐Seq data, and can be run in parallel on several processors. Source code and documentation for Pool‐hmm is freely available at https://qgsp.jouy.inra.fr/.


Molecular Ecology Resources | 2013

Allelic imbalance metre (Allim), a new tool for measuring allele-specific gene expression with RNA-seq data

Ram Vinay Pandey; Susanne U. Franssen; Andreas Futschik; Christian Schlötterer

Estimating differences in gene expression among alleles is of high interest for many areas in biology and medicine. Here, we present a user‐friendly software tool, Allim, to estimate allele‐specific gene expression. Because mapping bias is a major problem for reliable estimates of allele‐specific gene expression using RNA‐seq, Allim combines two different strategies to account for the mapping biases. In order to reduce the mapping bias, Allim first generates a polymorphism‐aware reference genome that accounts for the sequence variation between the alleles. Then, a sequence‐specific simulation tool estimates the residual mapping bias. Statistical tests for allelic imbalance are provided that can be used with the bias corrected RNA‐seq data.


Genetics | 2007

Locating multiple interacting quantitative trait loci using rank-based model selection

Małgorzata Zak; Andreas Baierl; Małgorzata Bogdan; Andreas Futschik

In previous work, a modified version of the Bayesian information criterion (mBIC) was proposed to locate multiple interacting quantitative trait loci (QTL). Simulation studies and real data analysis demonstrate good properties of the mBIC in situations where the error distribution is approximately normal. However, as with other standard techniques of QTL mapping, the performance of the mBIC strongly deteriorates when the trait distribution is heavy tailed or when the data contain a significant proportion of outliers. In the present article, we propose a suitable robust version of the mBIC that is based on ranks. We investigate the properties of the resulting method on the basis of theoretical calculations, computer simulations, and a real data analysis. Our simulation results show that for the sample sizes typically used in QTL mapping, the methods based on ranks are almost as efficient as standard techniques when the data are normal and are much better when the data come from some heavy-tailed distribution or include a proportion of outliers.

Collaboration


Dive into the Andreas Futschik's collaboration.

Top Co-Authors

Avatar

Christian Schlötterer

University of Veterinary Medicine Vienna

View shared research outputs
Top Co-Authors

Avatar

Susanne Waiblinger

University of Veterinary Medicine Vienna

View shared research outputs
Top Co-Authors

Avatar
Top Co-Authors

Avatar

Rupert Palme

University of Veterinary Medicine Vienna

View shared research outputs
Top Co-Authors

Avatar
Top Co-Authors

Avatar
Top Co-Authors

Avatar

Ram Vinay Pandey

Austrian Institute of Technology

View shared research outputs
Top Co-Authors

Avatar
Top Co-Authors

Avatar

Haipeng Li

CAS-MPG Partner Institute for Computational Biology

View shared research outputs
Top Co-Authors

Avatar

Kao Lin

CAS-MPG Partner Institute for Computational Biology

View shared research outputs
Researchain Logo
Decentralizing Knowledge