Xuecai Zhang
International Maize and Wheat Improvement Center
Publications
Featured research published by Xuecai Zhang.
Heredity | 2014
José Crossa; Paulino Pérez; John Hickey; Juan Burgueño; Leonardo Ornella; J. Jesus Céron-Rojas; Xuecai Zhang; Susanne Dreisigacker; Raman Babu; Yongle Li; David Bonnett; Ky L. Mathews
Genomic selection (GS) has been implemented in animal and plant species, and is regarded as a useful tool for accelerating genetic gains. Varying levels of genomic prediction accuracy have been obtained in plants, depending on the prediction problem assessed and on several other factors, such as trait heritability, the relationship between the individuals to be predicted and those used to train the models for prediction, number of markers, sample size and genotype × environment interaction (GE). The main objective of this article is to describe the results of genomic prediction in International Maize and Wheat Improvement Center’s (CIMMYT’s) maize and wheat breeding programs, from the initial assessment of the predictive ability of different models using pedigree and marker information to the present, when methods for implementing GS in practical global maize and wheat breeding programs are being investigated. Results show that pedigree (population structure) accounts for a sizeable proportion of the prediction accuracy when a global population is the prediction problem to be assessed. However, when the prediction uses unrelated populations to train the prediction equations, prediction accuracy becomes negligible. When genomic prediction includes modeling GE, an increase in prediction accuracy can be achieved by borrowing information from correlated environments. Several questions on how to incorporate GS into CIMMYT’s maize and wheat programs remain unanswered and subject to further investigation, for example, prediction within and between related bi-parental crosses. Further research on the quantification of breeding value components for GS in plant breeding populations is required.
Heredity | 2015
Xuecai Zhang; Paulino Pérez-Rodríguez; Kassa Semagn; Yoseph Beyene; Raman Babu; M A López-Cruz; F. M. San Vicente; Michael Olsen; Edward S. Buckler; J-L Jannink; Boddupalli M. Prasanna; José Crossa
One of the most important applications of genomic selection in maize breeding is to predict and identify the best untested lines from biparental populations, when the training and validation sets are derived from the same cross. Nineteen tropical maize biparental populations evaluated in multienvironment trials were used in this study to assess prediction accuracy of different quantitative traits using low-density (~200 markers) and genotyping-by-sequencing (GBS) single-nucleotide polymorphisms (SNPs). An extension of the Genomic Best Linear Unbiased Predictor that incorporates genotype × environment (GE) interaction was used to predict genotypic values; cross-validation methods were applied to quantify prediction accuracy. Our results showed that: (1) low-density SNPs (~200 markers) were largely sufficient to get good prediction in biparental maize populations for simple traits with moderate-to-high heritability, but GBS outperformed low-density SNPs for complex traits and simple traits evaluated under stress conditions with low-to-moderate heritability; (2) heritability and genetic architecture of target traits affected prediction performance: prediction accuracy of complex traits (grain yield) was consistently lower than that of simple traits (anthesis date and plant height), and prediction accuracy under stress conditions was consistently lower and more variable than under well-watered conditions for all the target traits because of their poor heritability under stress conditions; and (3) the prediction accuracy of GE models was found to be superior to that of non-GE models for complex traits and only marginally superior for simple traits.
Trends in Plant Science | 2017
José Crossa; Paulino Pérez-Rodríguez; Jaime Cuevas; Osval A. Montesinos-López; Diego Jarquin; Gustavo de los Campos; Juan Burgueño; Juan Manuel González-Camacho; Sergio Pérez-Elizalde; Yoseph Beyene; Susanne Dreisigacker; Ravi P. Singh; Xuecai Zhang; Manje Gowda; Manish Roorkiwal; Jessica Rutkoski; Rajeev K. Varshney
Genomic selection (GS) facilitates the rapid selection of superior genotypes and accelerates the breeding cycle. In this review, we discuss the history, principles, and basis of GS and genomic-enabled prediction (GP) as well as the genetics and statistical complexities of GP models, including genomic genotype×environment (G×E) interactions. We also examine the accuracy of GP models and methods for two cereal crops and two legume crops based on random cross-validation. GS applied to maize breeding has shown tangible genetic gains. Based on GP results, we speculate how GS in germplasm enhancement (i.e., prebreeding) programs could accelerate the flow of genes from gene bank accessions to elite lines. Recent advances in hyperspectral image technology could be combined with GS and pedigree-assisted breeding.
Heredity | 2014
Leonardo Ornella; Paulino Pérez; Elizabeth Tapia; Juan Manuel González-Camacho; Juan Burgueño; Xuecai Zhang; Sukhwinder Singh; Felix San Vicente; David Bonnett; Susanne Dreisigacker; Ravi P. Singh; N Long; José Crossa
Pearson’s correlation coefficient (ρ) is the most commonly reported metric of the success of prediction in genomic selection (GS). However, in real breeding ρ may not be very useful for assessing the quality of the regression in the tails of the distribution, where individuals are chosen for selection. This research used 14 maize and 16 wheat data sets with different trait–environment combinations. Six different models were evaluated by means of a cross-validation scheme (50 random partitions each, with 90% of the individuals in the training set and 10% in the testing set). The predictive accuracy of these algorithms for selecting individuals belonging to the best α=10, 15, 20, 25, 30, 35, 40% of the distribution was estimated using Cohen’s kappa coefficient (κ) and an ad hoc measure, which we call relative efficiency (RE), which indicates the expected genetic gain due to selection when individuals are selected based on GS exclusively. We put special emphasis on the analysis for α=15%, because it is a percentile commonly used in plant breeding programmes (for example, at CIMMYT). We also used ρ as a criterion for overall success. The algorithms used were: Bayesian LASSO (BL), Ridge Regression (RR), Reproducing Kernel Hilbert Spaces (RKHS), Random Forest Regression (RFR), and Support Vector Regression (SVR) with linear (lin) and Gaussian kernels (rbf). The performance of regression methods for selecting the best individuals was compared with that of three supervised classification algorithms: Random Forest Classification (RFC) and Support Vector Classification (SVC) with linear (lin) and Gaussian (rbf) kernels. Classification methods were evaluated using the same cross-validation scheme but with the response vector of the original training sets dichotomised using a given threshold.
For α=15%, SVC-lin presented the highest κ coefficients in 13 of the 14 maize data sets, with best values ranging from 0.131 to 0.722 (statistically significant in 9 data sets) and the best RE in the same 13 data sets, with values ranging from 0.393 to 0.948 (statistically significant in 12 data sets). RR produced the best mean for both κ and RE in one data set (0.148 and 0.381, respectively). Regarding the wheat data sets, SVC-lin presented the best κ in 12 of the 16 data sets, with outcomes ranging from 0.280 to 0.580 (statistically significant in 4 data sets) and the best RE in 9 data sets ranging from 0.484 to 0.821 (statistically significant in 5 data sets). SVC-rbf (0.235), RR (0.265) and RKHS (0.422) gave the best κ in one data set each, while RKHS and BL tied for the last one (0.234). Finally, BL presented the best RE in two data sets (0.738 and 0.750), RFR (0.636) and SVC-rbf (0.617) in one and RKHS in the remaining three (0.502, 0.458 and 0.586). The difference between the performance of SVC-lin and that of the rest of the models was not so pronounced at higher percentiles of the distribution. The behaviour of regression and classification algorithms varied markedly when selection was done at different thresholds, that is, κ and RE for each algorithm depended strongly on the selection percentile. Based on the results, we propose classification methods as a promising alternative for GS in plant breeding.
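As an illustration of the κ-based evaluation described in this abstract, the following is a minimal Python sketch, not the authors' code. It assumes that "best" means the largest trait values and that the top α fraction is taken by rank; the function names `top_fraction_labels` and `cohen_kappa` and the toy numbers are hypothetical. It labels the top α fraction of observed and of predicted values, then measures the chance-corrected agreement between the two selections.

```python
def top_fraction_labels(values, alpha):
    """Label the top alpha fraction (largest values) as 1 and the rest as 0.

    Assumption: larger trait values are better; ties are broken by position.
    """
    n_select = max(1, round(alpha * len(values)))
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    selected = set(order[:n_select])
    return [1 if i in selected else 0 for i in range(len(values))]


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two binary label vectors of equal length."""
    n = len(labels_a)
    # Observed agreement: fraction of individuals given the same label.
    p_obs = sum(1 for a, b in zip(labels_a, labels_b) if a == b) / n
    # Agreement expected by chance from the marginal selection rates.
    rate_a = sum(labels_a) / n
    rate_b = sum(labels_b) / n
    p_exp = rate_a * rate_b + (1 - rate_a) * (1 - rate_b)
    if p_exp == 1.0:
        return 1.0  # degenerate case: both vectors are constant and identical
    return (p_obs - p_exp) / (1 - p_exp)


# Toy data (hypothetical): observed trait values and model predictions.
observed = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
predicted = [9.5, 6, 8.2, 7, 9, 5, 4, 3, 2, 1]

obs_labels = top_fraction_labels(observed, alpha=0.2)
pred_labels = top_fraction_labels(predicted, alpha=0.2)
kappa = cohen_kappa(obs_labels, pred_labels)
print(round(kappa, 3))
```

In this toy case the model recovers one of the two truly best individuals, so κ is well below 1 even though the overall correlation between observed and predicted values is high, which is the point the abstract makes about ρ in the tails of the distribution.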
Theoretical and Applied Genetics | 2016
Yongsheng Wu; Felix San Vicente; Kaijian Huang; Thanda Dhliwayo; Denise E. Costich; Kassa Semagn; Nair Sudha; Michael Olsen; Boddupalli M. Prasanna; Xuecai Zhang; Raman Babu
Key message: Molecular characterization information on genetic diversity, population structure and genetic relationships provided by this research will help maize breeders to better understand how to utilize the current CML collection.
Abstract: CIMMYT maize inbred lines (CMLs) have been widely used all over the world and have contributed greatly to both tropical and temperate maize improvement. Genetic diversity and population structure of the current CML collection and of six temperate inbred lines were assessed and relationships among all lines were determined with genotyping-by-sequencing SNPs. Results indicated that: (1) wider genetic distance and low kinship coefficients among most pairs of lines reflected the uniqueness of most lines in the current CML collection; (2) the population structure and genetic divergence between the Temperate subgroup and Tropical subgroups were clear; three major environmental adaptation groups (Lowland Tropical, Subtropical/Mid-altitude and Highland Tropical subgroups) were clearly present in the current CML collection; (3) the genetic diversity of the three Tropical subgroups was similar and greater than that of the Temperate subgroup; the average genetic distance between the Temperate and Tropical subgroups was greater than among Tropical subgroups; and (4) heterotic patterns in each environmental adaptation group estimated using GBS SNPs were only partially consistent with patterns estimated based on combining ability tests and pedigree information. Combining current heterotic information based on combining ability tests and the genetic relationships inferred from molecular marker analyses may be the best strategy to define heterotic groups for future tropical maize improvement. Information resulting from this research will help breeders to better understand how to utilize all the CMLs to select parental lines, replace testers, assign heterotic groups and create a core set of breeding germplasm.
G3: Genes, Genomes, Genetics | 2017
Xuecai Zhang; Paulino Pérez-Rodríguez; Juan Burgueño; Michael Olsen; Edward S. Buckler; Gary N. Atlin; Boddupalli M. Prasanna; Mateo Vargas; Felix San Vicente; José Crossa
Genomic selection (GS) increases genetic gain by reducing the length of the selection cycle, as has been exemplified in maize using rapid cycling recombination of biparental populations. However, no results of GS applied to maize multi-parental populations have been reported so far. This study is the first to show realized genetic gains of rapid cycling genomic selection (RCGS) for four recombination cycles in a multi-parental tropical maize population. Eighteen elite tropical maize lines were intercrossed twice, and self-pollinated once, to form the cycle 0 (C0) training population. A total of 1000 ear-to-row C0 families was genotyped with 955,690 genotyping-by-sequencing SNP markers; their testcrosses were phenotyped at four optimal locations in Mexico to form the training population. Individuals from families with the best plant types, maturity, and grain yield were selected and intermated to form RCGS cycle 1 (C1). Predictions of the genotyped individuals forming cycle C1 were made, and the best predicted grain yielders were selected as parents of C2; this was repeated for more cycles (C2, C3, and C4), thereby achieving two cycles per year. Multi-environment trials of individuals from populations C0, C1, C2, C3, and C4, together with four benchmark checks were evaluated at two locations in Mexico. Results indicated that realized grain yield from C1 to C4 reached 0.225 ton ha−1 per cycle, which is equivalent to 0.100 ton ha−1 yr−1 over a 4.5-yr breeding period from the initial cross to the last cycle. Compared with the original 18 parents used to form cycle 0 (C0), genetic diversity narrowed only slightly during the last GS cycles (C3 and C4). Results indicate that, in tropical maize multi-parental breeding populations, RCGS can be an effective breeding strategy for simultaneously conserving genetic diversity and achieving high genetic gains in a short period of time.
PLOS ONE | 2016
Samuel Trachsel; Dapeng Sun; Felix M. SanVicente; Hongjian Zheng; Gary N. Atlin; Edgar Antonio Suarez; Raman Babu; Xuecai Zhang
We aimed to identify quantitative trait loci (QTL) for secondary traits related to grain yield (GY) in two BC1F2:3 backcross populations (LPSpop and DTPpop) under well-watered (4 environments; WW) and drought stressed (6 environments; DS) conditions to facilitate breeding efforts towards drought tolerant maize. GY reached 5.6 and 5.8 t/ha under WW in the LPSpop and the DTPpop, respectively. Under DS, grain yield was reduced by 65% (LPSpop) to 59% (DTPpop) relative to WW. GY was strongly associated with the normalized difference vegetative index (NDVI; r ranging from 0.61 to 0.96) across environmental conditions and with early flowering under drought stressed conditions (r ranging from -0.18 to -0.25), indicative of the importance of early vigor and drought escape for GY. Out of the 105 detected QTL, 53 were overdominant, indicative of strong heterosis. For 14 out of 18 detected vigor QTL, as well as for eight flowering time QTL, the trait increasing allele was derived from CML491. Collocations of early vigor QTL with QTL for stay green (bin 2.02, WW, LPSpop; 2.07, DS, DTPpop), the number of ears per plant (bins 2.02, 2.05, WW, LPSpop; 5.02, DS, LPSpop) and GY (bin 2.07, WW, DTPpop; 5.04, WW, LPSpop) reinforce the importance of the observed correlations. LOD scores for early vigor QTL in these bins ranged from 2.2 to 11.25, explaining 4.6 (additivity: +0.28) to 19.9% (additivity: +0.49) of the observed phenotypic variance. A strong flowering QTL was detected in bin 2.06 across populations and environmental conditions, explaining 26–31.3% of the observed phenotypic variation (LOD: 13–17; additivity: 0.1–0.6d). Improving drought tolerance while at the same time maintaining yield potential could be achieved by combining alleles conferring early vigor from the recurrent parent with alleles advancing flowering from the donor. Additionally, bin 8.06 (DTPpop) harbored a QTL for GY under WW (additivity: 0.27 t/ha) and DS (additivity: 0.58 t/ha).
R² ranged from 0 (DTPpop, WW) to 26.54% (LPSpop, DS) for NDVI, from 18.6 (LPSpop, WW) to 42.45% (LPSpop, DS) for anthesis and from 0 (DTPpop, DS) to 24.83% (LPSpop, WW) for GY. Lines out-yielding the best check by 32.5% (DTPpop, WW) to 60% (DTPpop, DS) were identified for all population-by-irrigation treatment combinations (except LPSpop, WW) and are immediately available for use by breeders.
The Plant Genome | 2017
Shiliang Cao; Alexander Loladze; Yibing Yuan; Yongsheng Wu; Ao Zhang; Jiafa Chen; G.M. Huestis; Jingsheng Cao; V. Chaikam; Michael Olsen; Boddupalli M. Prasanna; F.M. San Vicente; Xuecai Zhang
Association and linkage mapping are effective for dissecting the genetic architecture of complex traits in maize. Tar spot complex (TSC) resistance in maize is controlled by a major QTL and several minor QTL. A major QTL in bin 8.03 was confirmed by both association and linkage mapping. TSC resistance in tropical maize could be improved by marker-assisted selection (MAS) and genomic selection (GS), applied individually or stepwise.
Frontiers in Plant Science | 2017
Ao Zhang; Hongwu Wang; Yoseph Beyene; Kassa Semagn; Yubo Liu; Shiliang Cao; Zhenhai Cui; Yanye Ruan; Juan Burgueño; Felix San Vicente; Michael Olsen; Boddupalli M. Prasanna; José Crossa; Haiqiu Yu; Xuecai Zhang
Genomic selection is being used increasingly in plant breeding to accelerate genetic gain per unit time. One of the most important applications of genomic selection in maize breeding is to predict and select the best un-phenotyped lines in bi-parental populations based on genomic estimated breeding values. In the present study, 22 bi-parental tropical maize populations genotyped with low density SNPs were used to evaluate the genomic prediction accuracy (rMG) of six trait-environment combinations under various levels of training population size (TPS) and marker density (MD), and to assess the effect of trait heritability (h2), TPS and MD on rMG estimation. Our results showed that: (1) moderate rMG values were obtained for different trait-environment combinations when 50% of the total genotypes were used as the training population and ~200 SNPs were used for prediction; (2) rMG increased with an increase in h2, TPS and MD; both correlation and variance analyses showed that h2 is the most important factor and MD the least important factor affecting rMG estimation for most of the trait-environment combinations; (3) predictions between pairwise half-sib populations showed that the rMG values for all six trait-environment combinations were centered around zero, with 49% of predictions having rMG values above zero; (4) the trend observed in rMG differed from the trend observed in rMG/h, where h is the square root of the heritability of the predicted trait, indicating that both rMG and rMG/h values should be presented in GS studies to show the accuracy of genomic selection and the relative accuracy of genomic selection compared with phenotypic selection, respectively. This study provides useful information to maize breeders to design genomic selection workflows in their breeding programs.
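The two quantities contrasted in point (4) can be illustrated with a short Python sketch. It assumes that rMG is computed as the Pearson correlation between predicted and observed values, which the abstract does not state explicitly; the function names `pearson` and `relative_accuracy` and the numbers are hypothetical. The second function divides rMG by h, the square root of heritability, as the abstract describes.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def relative_accuracy(r_mg, h2):
    """Scale prediction accuracy by h, the square root of heritability h2."""
    return r_mg / math.sqrt(h2)


# Hypothetical example: the same rMG reads very differently once scaled by h.
r_mg = 0.40
low_h2_trait = relative_accuracy(r_mg, h2=0.25)   # h = 0.5
high_h2_trait = relative_accuracy(r_mg, h2=0.64)  # h = 0.8
print(low_h2_trait, high_h2_trait)
```

With these made-up numbers, an identical rMG of 0.40 gives a higher rMG/h for the low-heritability trait than for the high-heritability one, which is one way the two trends can diverge.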
Frontiers in Plant Science | 2018
Diego Cerrudo; Shiliang Cao; Yibing Yuan; Carlos Martinez; Edgar Antonio Suarez; Raman Babu; Xuecai Zhang; Samuel Trachsel
To increase genetic gain for tolerance to drought, we aimed to identify environmentally stable QTL in per se and testcross combination under well-watered (WW) and drought stressed (DS) conditions and evaluate the possible deployment of QTL using marker assisted and/or genomic selection (QTL/GS-MAS). A total of 169 doubled haploid lines derived from the cross between CML495 and LPSC7F64 and 190 testcrosses (tester CML494) were evaluated in a total of 11 treatment-by-population combinations under WW and DS conditions. In response to DS, grain yield (GY) and plant height (PHT) were reduced while time to anthesis and the anthesis silking interval (ASI) increased for both lines and hybrids. Forty-eight QTL were detected for a total of nine traits. The allele derived from CML495 generally increased trait values for anthesis, ASI, PHT, the normalized difference vegetative index (NDVI) and the green leaf area duration (GLAD; a composite trait of NDVI, PHT and senescence) while it reduced trait values for leaf rolling and senescence. The LOD scores for all detected QTL ranged from 2.0 to 7.2 explaining 4.4 to 19.4% of the observed phenotypic variance with R2 ranging from 0 (GY, DS, lines) to 37.3% (PHT, WW, lines). Prediction accuracy of the model used for genomic selection was generally higher than phenotypic variance explained by the sum of QTL for individual traits indicative of the polygenic control of traits evaluated here. We therefore propose to use QTL-MAS in forward breeding to enrich the allelic frequency for a few desired traits with strong additive QTL in early selection cycles while GS-MAS could be used in more mature breeding programs to additionally capture alleles with smaller additive effects.