Timothy M. Beissinger
University of Wisconsin-Madison
Publication
Featured research published by Timothy M. Beissinger.
Genetics | 2013
Timothy M. Beissinger; Candice N. Hirsch; Rajandeep S. Sekhon; Jillian M. Foerster; James M. Johnson; German Muttoni; Brieanne Vaillancourt; C. Robin Buell; Shawn M. Kaeppler; Natalia de Leon
Genotyping-by-sequencing (GBS) approaches provide low-cost, high-density genotype information. However, GBS has unique technical considerations, including a substantial amount of missing data and a nonuniform distribution of sequence reads. The goal of this study was to characterize technical variation using this method and to develop methods to optimize read depth to obtain desired marker coverage. To empirically assess the distribution of fragments produced using GBS, ∼8.69 Gb of GBS data were generated on the Zea mays reference inbred B73, utilizing ApeKI for genome reduction and single-end reads between 75 and 81 bp in length. We observed wide variation in sequence coverage across sites. Approximately 76% of potentially observable cut site-adjacent sequence fragments had no sequencing reads whereas a portion had substantially greater read depth than expected, up to 2369 times the expected mean. The methods described in this article facilitate determination of sequencing depth in the context of empirically defined read depth to achieve desired marker density for genetic mapping studies.
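The coverage figures quoted in this abstract (share of sites with no reads, maximum depth relative to the mean) are simple summaries of per-site read counts. The following is a minimal sketch of how such summaries could be computed; the function name and the toy counts are hypothetical and are not taken from the study.

```python
def summarize_depth(read_counts):
    """Summarize per-site read depth for a list of non-negative read counts."""
    n_sites = len(read_counts)
    if n_sites == 0:
        raise ValueError("need at least one site")
    mean_depth = sum(read_counts) / n_sites
    # Fraction of potentially observable sites that received no reads at all.
    zero_fraction = sum(1 for c in read_counts if c == 0) / n_sites
    # Deepest site expressed as a multiple of the mean depth.
    max_fold = max(read_counts) / mean_depth if mean_depth > 0 else 0.0
    return {
        "mean_depth": mean_depth,
        "zero_fraction": zero_fraction,
        "max_fold": max_fold,
    }


# Toy data (hypothetical): eight cut-site-adjacent fragments.
counts = [0, 0, 0, 12, 0, 3, 0, 1]
summary = summarize_depth(counts)
print(summary)
```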
Genetics | 2014
Timothy M. Beissinger; Candice N. Hirsch; Brieanne Vaillancourt; Shweta Deshpande; Kerrie Barry; C. Robin Buell; Shawn M. Kaeppler; Daniel Gianola; Natalia de Leon
A genome-wide scan to detect evidence of selection was conducted in the Golden Glow maize long-term selection population. The population had been subjected to selection for increased number of ears per plant for 30 generations, with an empirically estimated effective population size ranging from 384 to 667 individuals and an increase of more than threefold in the number of ears per plant. Allele frequencies at >1.2 million single-nucleotide polymorphism loci were estimated from pooled whole-genome resequencing data, and FST values across sliding windows were employed to assess divergence between the population preselection and the population postselection. Twenty-eight highly divergent regions were identified, with half of these regions providing gene-level resolution on potentially selected variants. Approximately 93% of the divergent regions do not demonstrate a significant decrease in heterozygosity, which suggests that they are not approaching fixation. Also, most regions display a pattern consistent with a soft-sweep model as opposed to a hard-sweep model, suggesting that selection mostly operated on standing genetic variation. For at least 25% of the regions, results suggest that selection operated on variants located outside of currently annotated coding regions. These results provide insights into the underlying genetic effects of long-term artificial selection and identification of putative genetic elements underlying number of ears per plant in maize.
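A sliding-window FST scan of the kind described here can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the abstract does not give the estimator, window size, or step, so the sketch uses the basic two-population form FST = (HT - HS) / HT and combines SNPs within a window as a ratio of sums. All function names and the toy frequencies are hypothetical.

```python
def fst_components(p1, p2):
    """Return (HT - HS, HT) for one biallelic SNP in two populations."""
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)                      # total heterozygosity
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population
    return h_t - h_s, h_t


def sliding_window_fst(freqs_pre, freqs_post, window, step):
    """FST per window of `window` SNPs, advancing `step` SNPs at a time."""
    if len(freqs_pre) != len(freqs_post):
        raise ValueError("frequency lists must have the same length")
    comps = [fst_components(a, b) for a, b in zip(freqs_pre, freqs_post)]
    results = []
    for start in range(0, len(comps) - window + 1, step):
        chunk = comps[start:start + window]
        num = sum(n for n, _ in chunk)
        den = sum(d for _, d in chunk)
        # Ratio of sums; monomorphic windows are reported as 0.0.
        results.append((start, num / den if den > 0 else 0.0))
    return results


# Toy allele frequencies (hypothetical) before and after selection.
pre = [0.5, 0.5, 0.5, 0.5]
post = [0.5, 0.5, 1.0, 1.0]
scan = sliding_window_fst(pre, post, window=2, step=2)
print(scan)
```

Windows whose FST stands well above the genome-wide background would be the candidates for divergent regions; how that threshold is set is outside the scope of this sketch.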
Genetics Selection Evolution | 2015
Timothy M. Beissinger; Guilherme J. M. Rosa; Shawn M. Kaeppler; Daniel Gianola; Natalia de Leon
Background: High-density genomic data is often analyzed by combining information over windows of adjacent markers. Interpretation of data grouped in windows versus at individual locations may increase statistical power, simplify computation, reduce sampling noise, and reduce the total number of tests performed. However, use of adjacent marker information can result in over- or under-smoothing, undesirable window boundary specifications, or highly correlated test statistics. We introduce a method for defining windows based on statistically guided breakpoints in the data, as a foundation for the analysis of multiple adjacent data points. This method involves first fitting a cubic smoothing spline to the data and then identifying the inflection points of the fitted spline, which serve as the boundaries of adjacent windows. This technique does not require prior knowledge of linkage disequilibrium, and therefore can be applied to data collected from individual or pooled sequencing experiments. Moreover, in contrast to existing methods, an arbitrary choice of window size is not necessary, since these are determined empirically and allowed to vary along the genome. Results: Simulations applying this method were performed to identify selection signatures from pooled sequencing FST data, for which allele frequencies were estimated from a pool of individuals. The relative ratio of true to false positives was twice that generated by existing techniques. A comparison of the approach to a previous study that involved pooled sequencing FST data from maize suggested that outlying windows were more clearly separated from their neighbors than when using a standard sliding window approach. Conclusions: We have developed a novel technique to identify window boundaries for subsequent analysis protocols. When applied to selection studies based on FST data, this method provides a high discovery rate and minimizes false positives. The method is implemented in the R package GenWin, which is publicly available from CRAN.
Genetics | 2014
Candice N. Hirsch; Sherry Flint-Garcia; Timothy M. Beissinger; Steven R. Eichten; Shweta Deshpande; Kerrie Barry; Michael D. McMullen; James B. Holland; Edward S. Buckler; Nathan M. Springer; C. Robin Buell; Natalia de Leon; Shawn M. Kaeppler
Grain produced from cereal crops is a primary source of human food and animal feed worldwide. To understand the genetic basis of seed-size variation, a grain yield component, we conducted a genome-wide scan to detect evidence of selection in the maize Krug Yellow Dent long-term divergent seed-size selection experiment. Previous studies have documented significant phenotypic divergence between the populations. Allele frequencies for ∼3 million single nucleotide polymorphisms (SNPs) in the base population and selected populations were estimated from pooled whole-genome resequencing of 48 individuals per population. Using FST values across sliding windows, 94 divergent regions with a median of six genes per region were identified. Additionally, 2729 SNPs that reached fixation in both selected populations with opposing fixed alleles were identified, many of which clustered in two regions of the genome. Copy-number variation was highly prevalent between the selected populations, with 532 total regions identified on the basis of read-depth variation and comparative genome hybridization. Regions important for seed weight in natural variation were identified in the maize nested association mapping population. However, the number of regions that overlapped with the long-term selection experiment did not exceed that expected by chance, possibly indicating unique sources of variation between the two populations. The results of this study provide insights into the genetic elements underlying seed-size variation in maize and could also have applications for other cereal crops.
Frontiers in Genetics | 2011
Xiao-Lin Wu; Timothy M. Beissinger; Stewart Bauck; Brent Woodward; Guilherme J. M. Rosa; Kent A. Weigel; Natalia de Leon Gatti; Daniel Gianola
High-throughput computing (HTC) uses computer clusters to solve advanced computational problems, with the goal of accomplishing high throughput over relatively long periods of time. In genomic selection, for example, a set of markers covering the entire genome is used to train a model based on known data, and the resulting model is used to predict the genetic merit of selection candidates. Sophisticated models are very computationally demanding and, with several traits to be evaluated sequentially, computing time is long, and output is low. In this paper, we present scenarios and basic principles of how HTC can be used in genomic selection, implemented using various techniques from simple batch processing to pipelining in distributed computer clusters. Various scripting languages, such as shell, Perl, and R, are also very useful for devising pipelines. By pipelining, we can reduce total computing time and consequently increase throughput. In comparison to the traditional data processing pipeline residing on the central processors, performing general-purpose computation on a graphics processing unit provides a new-generation approach to massively parallel computing in genomic selection. While the concept of HTC may still be new to many researchers in animal breeding, plant breeding, and genetics, HTC infrastructures have already been built in many institutions, such as the University of Wisconsin–Madison, which can be leveraged for genomic selection, in terms of central processing unit capacity, network connectivity, storage availability, and middleware connectivity. Exploring existing HTC infrastructures as well as general-purpose computing environments will further expand our capability to meet increasing computing demands posed by unprecedented genomic data that we have today.
We anticipate that HTC will impact genomic selection via better statistical models, faster solutions, and more competitive products (e.g., from design of marker panels to realized genetic gain). Eventually, HTC may change our view of data analysis as well as decision-making in the post-genomic era of selection programs in animals and plants, or in the study of complex diseases in humans.
Genetics Selection Evolution | 2012
Xiao-Lin Wu; Chuanyu Sun; Timothy M. Beissinger; Guilherme J. M. Rosa; Kent A. Weigel; Natalia de Leon Gatti; Daniel Gianola
Background: Most Bayesian models for the analysis of complex traits are not analytically tractable and inferences are based on computationally intensive techniques. This is true of Bayesian models for genome-enabled selection, which uses whole-genome molecular data to predict the genetic merit of candidate animals for breeding purposes. In this regard, parallel computing can overcome the bottlenecks that can arise from serial computing. Hence, a major goal of the present study is to bridge the gap to high-performance Bayesian computation in the context of animal breeding and genetics. Results: Parallel Markov chain Monte Carlo algorithms and strategies are described in the context of animal breeding and genetics. Parallel Monte Carlo algorithms are introduced as a starting point, including their applications to computing single-parameter and certain multiple-parameter models. Then, two basic approaches for parallel Markov chain Monte Carlo are described: one aims at parallelization within a single chain; the other is based on running multiple chains. Some variants are discussed as well. Features and strategies of the parallel Markov chain Monte Carlo are illustrated using real data, including a large beef cattle dataset with 50K SNP genotypes. Conclusions: Parallel Markov chain Monte Carlo algorithms are useful for computing complex Bayesian models; they not only lead to a dramatic speedup in computing but can also be used to optimize model parameters in complex Bayesian models. Hence, we anticipate that use of parallel Markov chain Monte Carlo will have a profound impact on the computational tools for genomic selection programs.
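The multiple-chain approach mentioned in this abstract can be sketched in a few lines. This is a toy illustration under stated assumptions, not the paper's algorithm: the target is a standard normal density, the sampler is a random-walk Metropolis step, and the function names, seeds, and tuning values are hypothetical. Threads are used only to keep the example self-contained; in standard CPython they show the structure of independent chains but do not speed up pure-Python work, so real gains would need separate processes or cluster jobs.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def log_target(x):
    """Log density of a standard normal, up to an additive constant."""
    return -0.5 * x * x


def run_chain(seed, n_samples, burn_in=500, step=1.0):
    """Random-walk Metropolis chain with its own seeded random generator."""
    rng = random.Random(seed)
    x = 0.0
    samples = []
    for i in range(n_samples + burn_in):
        proposal = x + rng.gauss(0.0, step)
        log_ratio = log_target(proposal) - log_target(x)
        # Accept with probability min(1, target(proposal) / target(x)).
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = proposal
        if i >= burn_in:
            samples.append(x)
    return samples


def run_parallel_chains(seeds, n_samples):
    """Run one independent chain per seed and return them in seed order."""
    with ThreadPoolExecutor(max_workers=len(seeds)) as pool:
        return list(pool.map(lambda s: run_chain(s, n_samples), seeds))


chains = run_parallel_chains([1, 2, 3, 4], 2000)
pooled = [x for chain in chains for x in chain]
pooled_mean = sum(pooled) / len(pooled)
pooled_var = sum((x - pooled_mean) ** 2 for x in pooled) / len(pooled)
print(round(pooled_mean, 3), round(pooled_var, 3))
```

Giving each chain its own seed keeps the chains independent and the run reproducible; pooling after burn-in is the simplest way to combine them.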
Theoretical and Applied Genetics | 2015
Jillian M. Foerster; Timothy M. Beissinger; Natalia de Leon; Shawn M. Kaeppler
Key message: Natural variation for the timing of vegetative phase change in maize is controlled by several large effect loci, one corresponding to Glossy15, a gene known for regulating juvenile tissue traits. Abstract: Vegetative phase change is an intrinsic component of developmental programs in plants. Juvenile and adult vegetative tissues in grasses differ dramatically in their anatomical and biochemical composition affecting the utility of specific genotypes as animal feed and biofuel feedstock. The molecular network controlling the process of developmental transition is incompletely characterized. In this study, we used scoring for juvenile and adult epicuticular wax as an entry point to discover quantitative trait loci (QTL) controlling phenotypic variation for the developmental timing of juvenile to adult transition in maize. We scored the last leaf with juvenile wax on 25 recombinant inbred line families of the B73 reference Nested Association Mapping (NAM) population and the intermated B73×Mo17 (IBM) population across multiple seasons. A total of 13 unique QTL were identified through genome-wide association analysis across the NAM populations, three of which have large effects. A QTL located on chromosome nine had the most significant SNPs within Glossy15, a gene controlling expression of juvenile leaf traits. The second large effect QTL is located on chromosome two. The most significant SNP in this QTL is located adjacent to a homolog of the Arabidopsis transcription factor, enhanced downy mildew-2, which has been shown to promote the transition from juvenile to adult vegetative phase. Overall, these results show that several major QTL and potential candidate genes underlie the extensive natural variation for this developmental trait.
G3: Genes, Genomes, Genetics | 2015
Nicholas J. Haase; Timothy M. Beissinger; Candice N. Hirsch; Brieanne Vaillancourt; Shweta Deshpande; Kerrie Barry; C. Robin Buell; Shawn M. Kaeppler; Natalia de Leon
Delayed transition from the vegetative stage to the reproductive stage of development and increased plant height have been shown to increase biomass productivity in grasses. The goal of this project was to detect quantitative trait loci using extremes from a large synthetic population, as well as a related recombinant inbred line mapping population for these two traits. Ten thousand individuals from a B73 × Mo17 noninbred population intermated for 14 generations (IBM Syn14) were grown at a density of approximately 16,500 plants ha−1. Flowering time and plant height were measured within this population. DNA was pooled from the 46 most extreme individuals from each distributional tail for each of the traits measured and used in bulk segregant analysis (BSA) sequencing. Allelic divergence at each of the ∼1.1 million SNP loci was estimated as the difference in allele frequencies between the selected extremes. Additionally, 224 intermated B73 × Mo17 recombinant inbred lines were concomitantly grown at a similar density adjacent to the large synthetic population and were assessed for flowering time and plant height. Using the BSA sequencing method, 14 and 13 genomic regions were identified for flowering time and plant height, respectively. Linkage mapping with the RIL population identified eight and three regions for flowering time and plant height, respectively. Of the regions identified, three colocalized between the two populations for flowering time and two colocalized for plant height. This study demonstrates the utility of using BSA sequencing for the dissection of complex quantitative traits important for production of lignocellulosic ethanol.
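The allelic divergence described in this abstract is a per-SNP difference in allele frequency between the two pooled extremes. A minimal sketch follows, assuming each pool is summarized as (reads supporting one parental allele, total reads) per SNP; that input layout, the function names, and the toy counts are hypothetical and not taken from the study.

```python
def allele_frequency(allele_reads, total_reads):
    """Pool allele frequency estimated as the share of reads carrying the allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return allele_reads / total_reads


def allelic_divergence(high_pool, low_pool):
    """Per-SNP frequency difference between the high and low extreme pools."""
    if len(high_pool) != len(low_pool):
        raise ValueError("pools must cover the same SNPs")
    return [
        allele_frequency(*high) - allele_frequency(*low)
        for high, low in zip(high_pool, low_pool)
    ]


# Toy read counts (hypothetical): (reads with the tracked allele, total reads).
high_tail = [(30, 40), (20, 40), (10, 40)]
low_tail = [(10, 40), (20, 40), (30, 40)]
divergence = allelic_divergence(high_tail, low_tail)
print(divergence)
```

Values near zero indicate SNPs where the two tails look alike, while large positive or negative values point to regions that differ between the extremes.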
G3: Genes, Genomes, Genetics | 2015
Aaron J. Lorenz; Timothy M. Beissinger; Renato Rodrigues Silva; Natalia de Leon
Maize silage is forage of high quality and yield, and represents the second most important use of maize in the United States. The Wisconsin Quality Synthetic (WQS) maize population has undergone five cycles of recurrent selection for silage yield and composition, resulting in a genetically improved population. The application of high-density molecular markers allows breeders and geneticists to identify important loci through association analysis and selection mapping, as well as to monitor changes in the distribution of genetic diversity across the genome. The objectives of this study were to identify loci controlling variation for maize silage traits through association analysis and the assessment of selection signatures and to describe changes in the genomic distribution of gene diversity through selection and genetic drift in the WQS recurrent selection program. We failed to find any significant marker-trait associations using the historical phenotypic data from WQS breeding trials combined with 17,719 high-quality, informative single nucleotide polymorphisms. Likewise, no strong genomic signatures were left by selection on silage yield and quality in the WQS despite genetic gain for these traits. These results could be due to the genetic complexity underlying these traits, or the role of selection on standing genetic variation. Variation in loss of diversity through drift was observed across the genome. Some large regions experienced much greater loss in diversity than what is expected, suggesting limited recombination combined with small populations in recurrent selection programs could easily lead to fixation of large swaths of the genome.
Heredity | 2016
Timothy M. Beissinger; M Gholami; Malena Erbe; Steffen Weigend; Annett Weigend; N de Leon; Daniel Gianola; Henner Simianer
A whole-genome scan for identifying selection acting on pairs of linked loci is proposed and implemented. The scan is based on one of Ohta’s 1982 measures of between-population linkage disequilibrium (LD). An approximate empirical null distribution for the statistic is suggested. Although the partitioning of LD into between-population components was originally used to investigate epistatic selection, we demonstrate that values of this statistic may also be influenced by single-locus selective sweeps with linkage but no epistasis. The proposed scan is implemented in a diverse panel of chickens including 72 distinct breeds genotyped at 538 298 single-nucleotide polymorphisms. In all, 1723 locus pairs are identified as putatively corresponding to a selective sweep or epistatic selection. These pairs of loci generally cluster to form overlapping or neighboring signals of selection. Known variants that were expected to have been under selection in the panel are identified, as well as an assortment of novel regions that have putatively been under selection in chickens. Notably, a promising pair of genes located 8 Mb apart on chromosome 9 is identified based on this statistic as demonstrating strong evidence of dispersive epistatic selection between populations.