Network


Latest external collaboration on country level. Dive into details by clicking on the dots.

Hotspot


Dive into the research topics where Barbara Rakitsch is active.

Publication


Featured researches published by Barbara Rakitsch.


Bioinformatics | 2013

A Lasso multi-marker mixed model for association mapping with population structure correction

Barbara Rakitsch; Christoph Lippert; Oliver Stegle; Karsten M. Borgwardt

MOTIVATION Exploring the genetic basis of heritable traits remains one of the central challenges in biomedical research. In traits with simple Mendelian architectures, single polymorphic loci explain a significant fraction of the phenotypic variability. However, many traits of interest seem to be subject to multifactorial control by groups of genetic loci. Accurate detection of such multivariate associations is non-trivial and often compromised by limited statistical power. At the same time, confounding influences, such as population structure, cause spurious association signals that result in false-positive findings. RESULTS We propose linear mixed models LMM-Lasso, a mixed model that allows for both multi-locus mapping and correction for confounding effects. Our approach is simple and free of tuning parameters; it effectively controls for population structure and scales to genome-wide datasets. LMM-Lasso simultaneously discovers likely causal variants and allows for multi-marker-based phenotype prediction from genotype. We demonstrate the practical use of LMM-Lasso in genome-wide association studies in Arabidopsis thaliana and linkage mapping in mouse, where our method achieves significantly more accurate phenotype prediction for 91% of the considered phenotypes. At the same time, our model dissects the phenotypic variability into components that result from individual single nucleotide polymorphism effects and population structure. Enrichment of known candidate genes suggests that the individual associations retrieved by LMM-Lasso are likely to be genuine. AVAILABILITY Code available under http://webdav.tuebingen. mpg.de/u/karsten/Forschung/research.html. CONTACT [email protected], [email protected] or [email protected] SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.


Nature Methods | 2015

Efficient set tests for the genetic analysis of correlated traits

Francesco Paolo Casale; Barbara Rakitsch; Christoph Lippert; Oliver Stegle

Set tests are a powerful approach for genome-wide association testing between groups of genetic variants and quantitative traits. We describe mtSet (http://github.com/PMBio/limix), a mixed-model approach that enables joint analysis across multiple correlated traits while accounting for population structure and relatedness. mtSet effectively combines the benefits of set tests with multi-trait modeling and is computationally efficient, enabling genetic analysis of large cohorts (up to 500,000 individuals) and multiple traits.


bioRxiv | 2014

LIMIX: genetic analysis of multiple traits

Christoph Lippert; Francesco Paolo Casale; Barbara Rakitsch; Oliver Stegle

Multi-trait mixed models have emerged as a promising approach for joint analyses of multiple traits. In principle, the mixed model framework is remarkably general. However, current methods implement only a very specific range of tasks to optimize the necessary computations. Here, we present a multi-trait modeling framework that is versatile and fast: LIMIX enables to flexibly adapt mixed models for a broad range of applications with different observed and hidden covariates, and variable study designs. To highlight the novel modeling aspects of LIMIX we performed three vastly different genetic studies: joint GWAS of correlated blood lipid phenotypes, joint analysis of the expression levels of the multiple transcript-isoforms of a gene, and pathway-based modeling of molecular traits across environments. In these applications we show that LIMIX increases GWAS power and phenotype prediction accuracy, in particular when integrating stepwise multi-locus regression into multi-trait models, and when analyzing large numbers of traits. An open source implementation of LIMIX is freely available at: https://github.com/PMBio/limix.


intelligent systems in molecular biology | 2011

ccSVM: correcting Support Vector Machines for confounding factors in biological data classification

Limin Li; Barbara Rakitsch; Karsten M. Borgwardt

Motivation: Classifying biological data into different groups is a central task of bioinformatics: for instance, to predict the function of a gene or protein, the disease state of a patient or the phenotype of an individual based on its genotype. Support Vector Machines are a wide spread approach for classifying biological data, due to their high accuracy, their ability to deal with structured data such as strings, and the ease to integrate various types of data. However, it is unclear how to correct for confounding factors such as population structure, age or gender or experimental conditions in Support Vector Machine classification. Results: In this article, we present a Support Vector Machine classifier that can correct the prediction for observed confounding factors. This is achieved by minimizing the statistical dependence between the classifier and the confounding factors. We prove that this formulation can be transformed into a standard Support Vector Machine with rescaled input data. In our experiments, our confounder correcting SVM (ccSVM) improves tumor diagnosis based on samples from different labs, tuberculosis diagnosis in patients of varying age, ethnicity and gender, and phenotype prediction in the presence of population structure and outperforms state-of-the-art methods in terms of prediction accuracy. Availability: A ccSVM-implementation in MATLAB is available from http://webdav.tuebingen.mpg.de/u/karsten/Forschung/ISMB11_ccSVM/. Contact: [email protected]; [email protected]


Proceedings of the National Academy of Sciences of the United States of America | 2016

Genetic architecture of nonadditive inheritance in Arabidopsis thaliana hybrids

Danelle K. Seymour; Eunyoung Chae; Dominik Grimm; Carmen Martín Pizarro; Anette Habring-Müller; François Vasseur; Barbara Rakitsch; Karsten M. Borgwardt; Daniel Koenig; Detlef Weigel

Significance Hybrid progeny of inbred parents are often more fit than their parents. Such hybrid vigor, or heterosis, is the focus of many plant breeding programs, and the rewards are evident. Hybrid maize has for many decades accounted for the majority of seed planted each year in North America and Europe. Despite the prevalence of this phenomenon and its agricultural importance, the genetic basis of heterotic traits is still unclear. We have used a large collection of first-generation hybrids in Arabidopsis thaliana to characterize the genetics of heterosis in this model plant. We have identified loci that contribute substantially to hybrid vigor and show that a subset of these exhibits classical dominance, an important finding with direct implications for crop improvement. The ubiquity of nonparental hybrid phenotypes, such as hybrid vigor and hybrid inferiority, has interested biologists for over a century and is of considerable agricultural importance. Although examples of both phenomena have been subject to intense investigation, no general model for the molecular basis of nonadditive genetic variance has emerged, and prediction of hybrid phenotypes from parental information continues to be a challenge. Here we explore the genetics of hybrid phenotype in 435 Arabidopsis thaliana individuals derived from intercrosses of 30 parents in a half diallel mating scheme. We find that nonadditive genetic effects are a major component of genetic variation in this population and that the genetic basis of hybrid phenotype can be mapped using genome-wide association (GWA) techniques. Significant loci together can explain as much as 20% of phenotypic variation in the surveyed population and include examples that have both classical dominant and overdominant effects. One candidate region inherited dominantly in the half diallel contains the gene for the MADS-box transcription factor AGAMOUS-LIKE 50 (AGL50), which we show directly to alter flowering time in the predicted manner. Our study not only illustrates the promise of GWA approaches to dissect the genetic architecture underpinning hybrid performance but also demonstrates the contribution of classical dominance to genetic variance.


Molecular Biology and Evolution | 2016

Genomic Profiles of Diversification and Genotype–Phenotype Association in Island Nematode Lineages

Angela McGaughran; Christian Rödelsperger; Dominik Grimm; Jan M. Meyer; Eduardo Moreno; Katy Morgan; Mark Leaver; Vahan Serobyan; Barbara Rakitsch; Karsten M. Borgwardt; Ralf J. Sommer

Understanding how new species form requires investigation of evolutionary forces that cause phenotypic and genotypic changes among populations. However, the mechanisms underlying speciation vary and little is known about whether genomes diversify in the same ways in parallel at the incipient scale. We address this using the nematode, Pristionchus pacificus, which resides at an interesting point on the speciation continuum (distinct evolutionary lineages without reproductive isolation), and inhabits heterogeneous environments subject to divergent environmental pressures. Using whole genome re-sequencing of 264 strains, we estimate FST to identify outlier regions of extraordinary differentiation (∼1.725 Mb of the 172.5 Mb genome). We find evidence for shared divergent genomic regions occurring at a higher frequency than expected by chance among populations of the same evolutionary lineage. We use allele frequency spectra to find that, among lineages, 53% of divergent regions are consistent with adaptive selection, whereas 24% and 23% of such regions suggest background selection and restricted gene flow, respectively. In contrast, among populations from the same lineage, similar proportions (34-48%) of divergent regions correspond to adaptive selection and restricted gene flow, whereas 13-22% suggest background selection. Because speciation often involves phenotypic and genomic divergence, we also evaluate phenotypic variation, focusing on pH tolerance, which we find is diverging in a manner corresponding to environmental differences among populations. Taking a genome-wide association approach, we functionally validate a significant genotype-phenotype association for this trait. Our results are consistent with P. pacificus undergoing heterogeneous genotypic and phenotypic diversification related to both evolutionary and environmental processes.


Genome Biology | 2016

Modelling local gene networks increases power to detect trans -acting genetic effects on gene expression

Barbara Rakitsch; Oliver Stegle

Expression quantitative trait loci (eQTL) mapping is a widely used tool to study the genetics of gene expression. Confounding factors and the burden of multiple testing limit the ability to map distal trans eQTLs, which is important to understand downstream genetic effects on genes and pathways. We propose a two-stage linear mixed model that first learns local directed gene-regulatory networks to then condition on the expression levels of selected genes. We show that this covariate selection approach controls for confounding factors and regulatory context, thereby increasing eQTL detection power and improving the consistency between studies. GNet-LMM is available at: https://github.com/PMBio/GNetLMM.


PLOS Genetics | 2017

Joint genetic analysis using variant sets reveals polygenic gene-context interactions

Francesco Paolo Casale; Danilo Horta; Barbara Rakitsch; Oliver Stegle

Joint genetic models for multiple traits have helped to enhance association analyses. Most existing multi-trait models have been designed to increase power for detecting associations, whereas the analysis of interactions has received considerably less attention. Here, we propose iSet, a method based on linear mixed models to test for interactions between sets of variants and environmental states or other contexts. Our model generalizes previous interaction tests and in particular provides a test for local differences in the genetic architecture between contexts. We first use simulations to validate iSet before applying the model to the analysis of genotype-environment interactions in an eQTL study. Our model retrieves a larger number of interactions than alternative methods and reveals that up to 20% of cases show context-specific configurations of causal variants. Finally, we apply iSet to test for sub-group specific genetic effects in human lipid levels in a large human cohort, where we identify a gene-sex interaction for C-reactive protein that is missed by alternative methods.


bioRxiv | 2016

Bootstrat: Population Informed Bootstrapping for Rare Variant Tests

Hailiang Huang; Gina M. Peloso; Daniel P. Howrigan; Barbara Rakitsch; Carl-Johann Simon-Gabriel; Jacqueline I. Goldstein; Mark J. Daly; Karsten M. Borgwardt; Benjamin M. Neale

Recent advances in genotyping and sequencing technologies have made detecting rare variants in large cohorts possible. Various analytic methods for associating disease to rare variants have been proposed, including burden tests, C-alpha and SKAT. Most of these methods, however, assume that samples come from a homogeneous population, which is not realistic for analyses of large samples. Not correcting for population stratification causes inflated p-values and false-positive associations. Here we propose a population-informed bootstrap resampling method that controls for population stratification (Bootstrat) in rare variant tests. In essence, the Bootstrat procedure uses genetic distance to create a phenotype probability for each sample. We show that this empirical approach can effectively correct for population stratification while maintaining statistical power comparable to established methods of controlling for population stratification. The Bootstrat scheme can be easily applied to existing rare variant testing methods with reasonable computational complexity. Author Summary Recent technology advances have enabled large-scale analysis of rare variants, but properly testing rare variants remains a significant challenge as most rare variant testing methods assume a sample of homogenous ethnicity, an assumption often not true for large cohorts. Failure to account for this heterogeneity increases the type I error rate. Here we propose a bootstrap scheme applicable to most existing rare variant testing methods to control for population heterogeneity. This scheme uses a randomization layer to establish a null distribution of the test statistics while preserving the sample genetic relationships. The null distribution is then used to calculate an empirical p-value that accounts for population heterogeneity. We demonstrate how this scheme successfully controls the type I error rate without loss of statistical power.


neural information processing systems | 2013

It is all in the noise: Efficient multi-task Gaussian process inference with structured residuals

Barbara Rakitsch; Christoph Lippert; Karsten M. Borgwardt; Oliver Stegle

Collaboration


Dive into the Barbara Rakitsch's collaboration.

Top Co-Authors

Avatar

Oliver Stegle

European Bioinformatics Institute

View shared research outputs
Top Co-Authors

Avatar
Top Co-Authors

Avatar

Francesco Paolo Casale

European Bioinformatics Institute

View shared research outputs
Top Co-Authors

Avatar
Top Co-Authors

Avatar
Top Co-Authors

Avatar
Top Co-Authors

Avatar
Top Co-Authors

Avatar
Top Co-Authors

Avatar
Researchain Logo
Decentralizing Knowledge