Publication


Featured research published by Jeffrey B. Endelman.


The Plant Genome | 2011

Ridge Regression and Other Kernels for Genomic Selection with R Package rrBLUP

Jeffrey B. Endelman

Many important traits in plant breeding are polygenic and therefore recalcitrant to traditional marker‐assisted selection. Genomic selection addresses this complexity by including all markers in the prediction model. A key method for the genomic prediction of breeding values is ridge regression (RR), which is equivalent to best linear unbiased prediction (BLUP) when the genetic covariance between lines is proportional to their similarity in genotype space. This additive model can be broadened to include epistatic effects by using other kernels, such as the Gaussian, which represent inner products in a complex feature space. To facilitate the use of RR and nonadditive kernels in plant breeding, a new software package for R called rrBLUP has been developed. At its core is a fast maximum‐likelihood algorithm for mixed models with a single variance component besides the residual error, which allows for efficient prediction with unreplicated training data. Use of the rrBLUP software is demonstrated through several examples, including the identification of optimal crosses based on superior progeny value. In cross‐validation tests, the prediction accuracy with nonadditive kernels was significantly higher than RR for wheat (Triticum aestivum L.) grain yield but equivalent for several maize (Zea mays L.) traits.
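The equivalence between ridge regression and BLUP stated in the abstract can be checked numerically. The block below is a minimal sketch in plain Python, not the rrBLUP implementation: it assumes the phenotypes are already centered, that the shrinkage parameter `lam` (the ratio of residual to marker-effect variance) is known rather than estimated by maximum likelihood, and that the Gaussian kernel takes the form exp(-(d/theta)^2) with Euclidean distance d. All function names and the toy data are hypothetical.

```python
import math

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def add_ridge(A, lam):
    """Return A + lam * I."""
    return [[a + (lam if i == j else 0.0) for j, a in enumerate(row)]
            for i, row in enumerate(A)]

def rr_predict(Z, y, lam):
    """Ridge regression: estimate marker effects, then sum them per line."""
    Zt = transpose(Z)
    Zty = [sum(z * yi for z, yi in zip(col, y)) for col in Zt]
    u = solve(add_ridge(matmul(Zt, Z), lam), Zty)
    return [sum(z * ui for z, ui in zip(row, u)) for row in Z]

def kernel_predict(K, y, lam):
    """Kernel (BLUP-style) prediction: g = K (K + lam I)^-1 y."""
    alpha = solve(add_ridge(K, lam), y)
    return [sum(k * a for k, a in zip(row, alpha)) for row in K]

def gaussian_kernel(Z, theta):
    """Assumed form: K_ij = exp(-(d_ij / theta)^2), d = Euclidean distance."""
    def dist(a, b):
        return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))
    return [[math.exp(-(dist(a, b) / theta) ** 2) for b in Z] for a in Z]

# Toy data: 4 lines x 3 markers coded {-1, 0, 1}; phenotypes sum to zero.
Z = [[1, -1, 0], [0, 1, 1], [-1, 0, 1], [1, 1, -1]]
y = [0.5, -0.2, -0.7, 0.4]
lam = 1.0

g_rr = rr_predict(Z, y, lam)                               # marker-effect route
g_blup = kernel_predict(matmul(Z, transpose(Z)), y, lam)   # linear kernel ZZ'
g_gauss = kernel_predict(gaussian_kernel(Z, theta=2.0), y, lam)
```

With the linear kernel ZZ′, `g_rr` and `g_blup` agree to numerical precision, which is the additive-model equivalence. Swapping in the Gaussian kernel changes only the matrix `K`, not the prediction formula.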


The Plant Genome | 2012

Genomic Selection in Wheat Breeding using Genotyping-by-Sequencing

Jesse Poland; Jeffrey B. Endelman; J. C. Dawson; Jessica Rutkoski; Shuangye Wu; Yann Manes; Susanne Dreisigacker; José Crossa; Héctor Sánchez-Villeda; Mark E. Sorrells; Jean-Luc Jannink

Genomic selection (GS) uses genomewide molecular markers to predict breeding values and make selections of individuals or breeding lines prior to phenotyping. Here we show that genotyping‐by‐sequencing (GBS) can be used for de novo genotyping of breeding panels and to develop accurate GS models, even for the large, complex, and polyploid wheat (Triticum aestivum L.) genome. With GBS we discovered 41,371 single nucleotide polymorphisms (SNPs) in a set of 254 advanced breeding lines from CIMMYT's semiarid wheat breeding program. Four different methods were evaluated for imputing missing marker scores in this set of unmapped markers, including random forest regression and a newly developed multivariate‐normal expectation‐maximization algorithm, which gave more accurate imputation than heterozygous or mean imputation at the marker level, although no significant differences were observed in the accuracy of genomic‐estimated breeding values (GEBVs) among imputation methods. Genomic‐estimated breeding value prediction accuracies with GBS were 0.28 to 0.45 for grain yield, an improvement of 0.1 to 0.2 over an established marker platform for wheat. Genotyping‐by‐sequencing combines marker discovery and genotyping of large populations, making it an excellent marker platform for breeding applications even in the absence of a reference genome sequence or previous polymorphism discovery. In addition, the flexibility and low cost of GBS make this an ideal approach for genomics‐assisted breeding.
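Of the four imputation methods named in the abstract, mean imputation at the marker level is the simplest baseline. The block below is a minimal sketch of that baseline only. It assumes missing scores are stored as `None` and markers are coded numerically; the function name and toy data are hypothetical. The random forest and expectation‐maximization methods are not reproduced here.

```python
def mean_impute(markers):
    """Replace each missing score (None) with the mean of the observed
    scores for the same marker (column)."""
    n_markers = len(markers[0])
    means = []
    for j in range(n_markers):
        observed = [row[j] for row in markers if row[j] is not None]
        # A marker with no observed scores falls back to 0.0 (an assumption).
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if v is None else v for j, v in enumerate(row)]
            for row in markers]

# Toy data: 3 lines x 3 markers coded {-1, 1}, with two missing calls.
markers = [
    [1, None, -1],
    [1, 1, None],
    [-1, 1, -1],
]
imputed = mean_impute(markers)
```

Each missing call takes the column mean, so the imputed value depends only on that marker and ignores information from correlated markers, which is one reason more elaborate methods can impute more accurately.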


G3: Genes, Genomes, Genetics | 2012

Shrinkage estimation of the realized relationship matrix.

Jeffrey B. Endelman; Jean-Luc Jannink

The additive relationship matrix plays an important role in mixed model prediction of breeding values. For genotype matrix X (loci in columns), the product XX′ is widely used as a realized relationship matrix, but the scaling of this matrix is ambiguous. Our first objective was to derive a proper scaling such that the mean diagonal element equals 1+f, where f is the inbreeding coefficient of the current population. The result is a formula involving the covariance matrix for sampling genomic loci, which must be estimated with markers. Our second objective was to investigate whether shrinkage estimation of this covariance matrix can improve the accuracy of breeding value (GEBV) predictions with low-density markers. Using an analytical formula for shrinkage intensity that is optimal with respect to mean-squared error, simulations revealed that shrinkage can significantly increase GEBV accuracy in unstructured populations, but only for phenotyped lines; there was no benefit for unphenotyped lines. The accuracy gain from shrinkage increased with heritability, but at high heritability (> 0.6) this benefit was irrelevant because phenotypic accuracy was comparable. These trends were confirmed in a commercial pig population with progeny-test-estimated breeding values. For an anonymous trait where phenotypic accuracy was 0.58, shrinkage increased the average GEBV accuracy from 0.56 to 0.62 (SE < 0.00) when using random sets of 384 markers from a 60K array. We conclude that when moderate-accuracy phenotypes and low-density markers are available for the candidates of genomic selection, shrinkage estimation of the relationship matrix can improve genetic gain.
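To make the scaling and shrinkage ideas concrete, the block below sketches one commonly used construction of a realized relationship matrix: markers are centered by allele frequency and XX′ is divided by twice the sum of p(1 − p). This is an assumed formula for illustration and may differ in detail from the one derived in the paper. The shrinkage step pulls the matrix toward a diagonal target with a user-supplied intensity `delta`; the analytical optimal intensity described in the abstract is not reproduced. Function names, the 0/1/2 dosage coding, and the toy data are hypothetical.

```python
def realized_relationship(X):
    """X: lines x markers, coded as allele dosage 0/1/2 (assumed coding).
    Returns W W' / (2 * sum_k p_k (1 - p_k)) with W centered by 2 p_k."""
    n, m = len(X), len(X[0])
    p = [sum(row[k] for row in X) / (2.0 * n) for k in range(m)]
    W = [[row[k] - 2.0 * p[k] for k in range(m)] for row in X]
    scale = 2.0 * sum(pk * (1.0 - pk) for pk in p)
    return [[sum(a * b for a, b in zip(wi, wj)) / scale for wj in W] for wi in W]

def shrink(A, delta):
    """Shrink A toward (mean diagonal) * I with intensity delta in [0, 1]."""
    n = len(A)
    mean_diag = sum(A[i][i] for i in range(n)) / n
    return [[(1.0 - delta) * A[i][j] + (delta * mean_diag if i == j else 0.0)
             for j in range(n)] for i in range(n)]

# Toy data: 4 lines x 3 markers.
X = [[0, 2, 1], [2, 0, 1], [1, 1, 2], [2, 2, 0]]
A = realized_relationship(X)
A_shrunk = shrink(A, delta=0.3)
```

Shrinking toward a scaled identity leaves the mean diagonal unchanged while damping the off-diagonal elements, which are the entries most affected by sampling noise when few markers are available.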


PLOS ONE | 2014

The USDA barley core collection: genetic diversity, population structure, and potential for genome-wide association studies.

María Muñoz-Amatriaín; Alfonso Cuesta-Marcos; Jeffrey B. Endelman; Jordi Comadran; John M. Bonman; Harold E. Bockelman; Shiaoman Chao; Joanne Russell; Robbie Waugh; Patrick M. Hayes; Gary J. Muehlbauer

New sources of genetic diversity must be incorporated into plant breeding programs if they are to continue increasing grain yield and quality, and tolerance to abiotic and biotic stresses. Germplasm collections provide a source of genetic and phenotypic diversity, but characterization of these resources is required to increase their utility for breeding programs. We used a barley SNP iSelect platform with 7,842 SNPs to genotype 2,417 barley accessions sampled from the USDA National Small Grains Collection of 33,176 accessions. Most of the accessions in this core collection are categorized as landraces or cultivars/breeding lines and were obtained from more than 100 countries. Both STRUCTURE and principal component analysis identified five major subpopulations within the core collection, mainly differentiated by geographical origin and spike row number (an inflorescence architecture trait). Different patterns of linkage disequilibrium (LD) were found across the barley genome and many regions of high LD contained traits involved in domestication and breeding selection. The genotype data were used to define ‘mini-core’ sets of accessions capturing the majority of the allelic diversity present in the core collection. These ‘mini-core’ sets can be used for evaluating traits that are difficult or expensive to score. Genome-wide association studies (GWAS) of ‘hull cover’, ‘spike row number’, and ‘heading date’ demonstrate the utility of the core collection for locating genetic factors determining important phenotypes. The GWAS results were referenced to a new barley consensus map containing 5,665 SNPs. Our results demonstrate that GWAS and high-density SNP genotyping are effective tools for plant breeders interested in accessing genetic diversity in large germplasm collections.


Genetics | 2013

Genomic Predictability of Interconnected Biparental Maize Populations

Christian Riedelsheimer; Jeffrey B. Endelman; Michael Stange; Mark E. Sorrells; Jean-Luc Jannink; Albrecht E. Melchinger

Intense structuring of plant breeding populations challenges the design of the training set (TS) in genomic selection (GS). An important open question is how the TS should be constructed from multiple related or unrelated small biparental families to predict progeny from individual crosses. Here, we used a set of five interconnected maize (Zea mays L.) populations of doubled-haploid (DH) lines derived from four parents to systematically investigate how the composition of the TS affects the prediction accuracy for lines from individual crosses. A total of 635 DH lines genotyped with 16,741 polymorphic SNPs were evaluated for five traits including Gibberella ear rot severity and three kernel yield component traits. The populations showed a genomic similarity pattern, which reflects the crossing scheme with a clear separation of full sibs, half sibs, and unrelated groups. Prediction accuracies within full-sib families of DH lines followed closely theoretical expectations, accounting for the influence of sample size and heritability of the trait. Prediction accuracies declined by 42% if full-sib DH lines were replaced by half-sib DH lines, but statistically significantly better results could be achieved if half-sib DH lines were available from both instead of only one parent of the validation population. Once both parents of the validation population were represented in the TS, including more crosses with a constant TS size did not increase accuracies. Unrelated crosses showing opposite linkage phases with the validation population resulted in negative or reduced prediction accuracies, if used alone or in combination with related families, respectively. We suggest identifying and excluding such crosses from the TS. Moreover, the observed variability among populations and traits suggests that these uncertainties must be taken into account in models optimizing the allocation of resources in GS.


The Plant Genome | 2015

Assessing genomic selection prediction accuracy in a dynamic barley breeding population

Ahmad Sallam; Jeffrey B. Endelman; Jean-Luc Jannink; Kevin P. Smith

Prediction accuracy of genomic selection (GS) has been previously evaluated through simulation and cross‐validation; however, validation based on progeny performance in a plant breeding program has not been investigated thoroughly. We evaluated several prediction models in a dynamic barley breeding population comprised of 647 six‐row lines using four traits differing in genetic architecture and 1536 single nucleotide polymorphism (SNP) markers. The breeding lines were divided into six sets designated as one parent set and five consecutive progeny sets comprised of representative samples of breeding lines over a 5‐yr period. We used these data sets to investigate the effect of model and training population composition on prediction accuracy over time. We found little difference in prediction accuracy among the models confirming prior studies that found the simplest model, random regression best linear unbiased prediction (RR‐BLUP), to be accurate across a range of situations. In general, we found that using the parent set was sufficient to predict progeny sets with little to no gain in accuracy from generating larger training populations by combining the parent set with subsequent progeny sets. The prediction accuracy ranged from 0.03 to 0.99 across the four traits and five progeny sets. We explored characteristics of the training and validation populations (marker allele frequency, population structure, and linkage disequilibrium, LD) as well as characteristics of the trait (genetic architecture and heritability, H²). Fixation of markers associated with a trait over time was most clearly associated with reduced prediction accuracy for the mycotoxin trait DON. Higher trait H² in the training population and simpler trait architecture were associated with greater prediction accuracy.


Bioinformatics | 2014

LPmerge: an R package for merging genetic maps by linear programming

Jeffrey B. Endelman; Christophe Plomion

Consensus genetic maps constructed from multiple populations are an important resource for both basic and applied research, including genome-wide association analysis, genome sequence assembly and studies of evolution. The LPmerge software uses linear programming to efficiently minimize the mean absolute error between the consensus map and the linkage maps from each population. This minimization is performed subject to linear inequality constraints that ensure the ordering of the markers in the linkage maps is preserved. When marker order is inconsistent between linkage maps, a minimum set of ordinal constraints is deleted to resolve the conflicts. Availability and implementation: LPmerge is on CRAN at http://cran.r-project.org/web/packages/LPmerge.
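The optimization described in the abstract can be written as a linear program using the standard device for absolute values. The formulation below is a sketch under stated assumptions, not the exact LPmerge objective, whose weighting and details may differ. The symbols are introduced here for illustration: x_i is the consensus position of marker i, y_{ik} is its position in linkage map k, M_k is the set of markers in map k, and e_{ik} is an auxiliary error variable.

```latex
\begin{aligned}
\min_{x,\,e}\quad & \sum_{k} \sum_{i \in M_k} e_{ik} \\
\text{subject to}\quad
  & e_{ik} \ge x_i - y_{ik}, \qquad e_{ik} \ge y_{ik} - x_i
      && \text{for all } k,\ i \in M_k, \\
  & x_j - x_i \ge 0
      && \text{whenever } i \text{ precedes } j \text{ in some linkage map}, \\
  & x_i \ge 0 && \text{for all } i.
\end{aligned}
```

At the optimum each e_{ik} equals |x_i − y_{ik}|, so the objective is the total absolute error, and dividing by the number of terms gives the mean. The ordering constraints become infeasible only when maps disagree on marker order, which is the case the abstract handles by deleting a minimum set of ordinal constraints.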


The Plant Genome | 2016

Software for Genome-Wide Association Studies in Autopolyploids and Its Application to Potato

Umesh R. Rosyara; Walter De Jong; David S. Douches; Jeffrey B. Endelman

Genome‐wide association studies (GWAS) are widely used in diploid species to study complex traits in diversity and breeding populations, but GWAS software tailored to autopolyploids is lacking. The objectives of this research were to (i) develop an R package for autopolyploids based on the Q + K mixed model, (ii) validate the software with simulated data, and (iii) analyze a diversity panel of tetraploid potatoes. A unique feature of the R package, called GWASpoly, is its ability to model different types of polyploid gene action, including additive, simplex dominant, and duplex dominant. Using a simulated tetraploid population, we confirmed our hypothesis that statistical power is higher when the assumed gene action in the GWAS model matches the gene action at unobserved quantitative trait loci (QTL). Thirteen traits were analyzed in the Solanaceae Coordinated Agricultural Project (SolCAP) potato diversity panel and, consistent with previous studies, significant QTL for tuber shape and eye depth co‐localized on chromosome 10. For the other traits, only marginally significant QTL were detected, most likely due to insufficient statistical power: for simulated traits with a heritability (h²) of 0.3, the median genome‐wide power was only 0.01. Our results indicate that both marker density and population size were limiting factors for GWAS with the SolCAP panel.
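The three gene action models named in the abstract differ in how a tetraploid allele dosage (0 to 4 copies) is turned into a model covariate. The block below is an illustrative encoding, not the GWASpoly code. It assumes the counted (alternate) allele is the dominant one, and the model labels and function name are hypothetical rather than GWASpoly argument names. The mirror-image case, with the other allele dominant, would apply the same thresholds to `ploidy - dosage`.

```python
def encode_dosage(dosage, model, ploidy=4):
    """Map allele dosage to a covariate under an assumed gene action model.

    additive:          covariate is proportional to dosage
    simplex-dominant:  one or more copies behaves like the full homozygote
    duplex-dominant:   two or more copies behaves like the full homozygote
    """
    if not 0 <= dosage <= ploidy:
        raise ValueError("dosage must be between 0 and ploidy")
    if model == "additive":
        return float(dosage)
    if model == "simplex-dominant":
        return 1.0 if dosage >= 1 else 0.0
    if model == "duplex-dominant":
        return 1.0 if dosage >= 2 else 0.0
    raise ValueError("unknown model: " + model)

# Covariates for the five tetraploid genotype classes (0 to 4 copies).
additive = [encode_dosage(d, "additive") for d in range(5)]
simplex = [encode_dosage(d, "simplex-dominant") for d in range(5)]
duplex = [encode_dosage(d, "duplex-dominant") for d in range(5)]
```

The three covariate vectors group the genotype classes differently, which is why a GWAS model has more power when its assumed gene action matches the gene action at the underlying QTL.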


Theoretical and Applied Genetics | 2016

Genetic mapping with an inbred line-derived F2 population in potato

Jeffrey B. Endelman; Shelley Jansky

Key message: This is the first report of the production and use of a diploid inbred line-based F2 population for genetic mapping in potato.

Abstract: Potato (Solanum tuberosum L.) is an important global food crop, for which tetrasomic inheritance and self-incompatibility have limited both genetic discovery and breeding gains. We report here on the creation of the first diploid inbred line-derived F2 population in potato, and demonstrate its utility for genetic mapping. To create the population, the doubled monoploid potato DM1-3 was crossed as a female to M6, an S7 inbred line derived from the wild relative S. chacoense, and a single F1 plant was then self-pollinated. A genetic linkage map with 2264 single nucleotide polymorphisms was constructed and used to improve the physical anchoring of superscaffolds in the potato reference genome, which is based on DM1-3. Segregation was observed for skin and flesh color, skin and flesh pigment intensity, tuber shape, anther development, jelly end, and the presence of eye tubers instead of normal sprouts. Using the R/qtl software, we detected 10 genes, 7 of which have been previously mapped and 3 for which this is the first publication. The latter category includes tightly linked genes for the jelly end and eye tuber traits on chromosome 5. The development of recombinant inbred lines from this F2 population by single-seed descent is underway and should facilitate even better resolution of these and other loci.


BMC Genomics | 2014

Genome-wide distribution of genetic diversity and linkage disequilibrium in a mass-selected population of maritime pine

Christophe Plomion; Emilie Chancerel; Jeffrey B. Endelman; Jean-Baptiste Lamy; Eric Mandrou; Isabelle Lesur; François Ehrenmann; Fikret Isik; Marco C. A. M. Bink; Laurent Bouffier

Background: The accessibility of high-throughput genotyping technologies has contributed greatly to the development of genomic resources in non-model organisms. High-density genotyping arrays have only recently been developed for some economically important species such as conifers. The potential for using genomic technologies in association mapping and breeding depends largely on the genome-wide patterns of diversity and linkage disequilibrium in current breeding populations. This study aims to deepen our knowledge regarding these issues in maritime pine, the first species used for reforestation in southwestern Europe.

Results: Using a new map merging algorithm, we first established a 1,712 cM composite linkage map (comprising 1,838 SNP markers in 12 linkage groups) by bringing together three already available genetic maps. Using rigorous statistical testing based on kernel density estimation and resampling we identified cold and hot spots of recombination. In parallel, 186 unrelated trees of a mass-selected population were genotyped using a 12k-SNP array. A total of 2,600 informative SNPs allowed us to describe historical recombination, genetic diversity and genetic structure of this recently domesticated breeding pool that forms the basis of much of the current and future breeding of this species. We observe very low levels of population genetic structure and find no evidence that artificial selection has caused a reduction in genetic diversity. By combining these two pieces of information, we provided the map position of 1,671 SNPs corresponding to 1,192 different loci. This made it possible to analyze the spatial pattern of genetic diversity (He) and long distance linkage disequilibrium (LD) along the chromosomes. We found no particular pattern in the empirical variogram of He across the 12 linkage groups and, as expected for an outcrossing species with large effective population size, we observed an almost complete lack of long distance LD.

Conclusions: These results are a stepping stone for the development of strategies for studies in population genomics, association mapping and genomic prediction in this economically and ecologically important forest tree species.
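The diversity statistic He in the abstract is the expected heterozygosity, which for a biallelic SNP with allele frequency p is 2p(1 − p). The block below is a minimal sketch of that calculation for one locus. It assumes diploid genotypes stored as allele dosages (0, 1, or 2); the function names and toy genotypes are hypothetical and are not taken from the study.

```python
def allele_frequency(dosages):
    """Frequency of the counted allele from diploid dosages (0, 1, or 2)."""
    return sum(dosages) / (2.0 * len(dosages))

def expected_heterozygosity(p):
    """He for a biallelic locus with allele frequency p: 2 p (1 - p)."""
    return 2.0 * p * (1.0 - p)

# Toy data: one SNP genotyped in 8 trees.
dosages = [0, 1, 2, 1, 0, 1, 1, 2]
p = allele_frequency(dosages)
he = expected_heterozygosity(p)
```

He reaches its maximum of 0.5 when both alleles are equally frequent and falls to 0 when one allele is fixed, so values along a linkage group summarize how diversity varies by map position.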

Collaboration

Top Co-Authors

Asunta L. Thompson
North Dakota State University

David G. Holm
Colorado State University

R. G. Novy
Agricultural Research Service

Jiwan P. Palta
University of Wisconsin-Madison

Paul C. Bethke
Agricultural Research Service