Kevin Stoffel
University of California, Davis
Publication
Featured research published by Kevin Stoffel.
BMC Genomics | 2012
Hamid Ashrafi; Theresa Hill; Kevin Stoffel; Alexander Kozik; Jiqiang Yao; Sebastián Reyes Chin-Wo; Allen Van Deynze
Background: Molecular breeding of pepper (Capsicum spp.) can be accelerated by developing DNA markers associated with transcriptomes in breeding germplasm. Before the advent of next generation sequencing (NGS) technologies, the majority of sequencing data were generated by the Sanger sequencing method. By leveraging Sanger EST data, we have generated a wealth of genetic information for pepper including thousands of SNPs and Single Position Polymorphic (SPP) markers. To complement and enhance these resources, we applied NGS to three pepper genotypes: Maor, Early Jalapeño and Criollo de Morelos-334 (CM334) to identify SNPs and SSRs in the assembly of these three genotypes.
Results: Two pepper transcriptome assemblies were developed with different purposes. The first reference sequence, assembled by CAP3 software, comprises 31,196 contigs from >125,000 Sanger-EST sequences that were mainly derived from a Korean F1-hybrid line, Bukang. Overlapping probes were designed for 30,815 unigenes to construct a pepper Affymetrix GeneChip® microarray for whole genome analyses. In addition, custom Python scripts were used to identify 4,236 SNPs in contigs of the assembly. A total of 2,489 simple sequence repeats (SSRs) were identified from the assembly, and primers were designed for the SSRs. Annotation of contigs using Blast2GO software resulted in information for 60% of the unigenes in the assembly. The second transcriptome assembly was constructed from more than 200 million Illumina Genome Analyzer II reads (80–120 nt) using a combination of Velvet, CLC workbench and CAP3 software packages. BWA, SAMtools and in-house Perl scripts were used to identify SNPs among three pepper genotypes. The SNPs were filtered to be at least 50 bp from any intron-exon junctions as well as flanking SNPs. More than 22,000 high-quality putative SNPs were identified. Using the MISA software, 10,398 SSR markers were also identified within the Illumina transcriptome assembly and primers were designed for the identified markers. The assembly was annotated by Blast2GO and 14,740 (12%) of annotated contigs were associated with functional proteins.
Conclusions: Before the availability of the pepper genome sequence, assembling transcriptomes of this economically important crop was required to generate thousands of high-quality molecular markers that could be used in breeding programs. In order to have a better understanding of the assembled sequences and to identify candidate genes underlying QTLs, we annotated the contigs of Sanger-EST and Illumina transcriptome assemblies. This and other information has been curated in a database dedicated to the pepper project.
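The distance filter described in the abstract above (SNPs at least 50 bp from any intron-exon junction and from flanking SNPs) can be illustrated with a minimal Python sketch. This is not the authors' in-house Perl code; the function names, the use of plain integer positions on a single contig, and the choice to compare each SNP against all candidate SNPs (including ones that are themselves rejected) are assumptions made for illustration.

```python
from bisect import bisect_left

def min_distance(pos, sorted_sites):
    """Distance from pos to the nearest site in a sorted list (inf if empty)."""
    if not sorted_sites:
        return float("inf")
    i = bisect_left(sorted_sites, pos)
    candidates = []
    if i < len(sorted_sites):
        candidates.append(sorted_sites[i] - pos)   # nearest site at or right of pos
    if i > 0:
        candidates.append(pos - sorted_sites[i - 1])  # nearest site left of pos
    return min(candidates)

def filter_snps(snp_positions, junction_positions, min_gap=50):
    """Keep SNPs at least min_gap bp from every junction and every other SNP.

    Hypothetical helper: positions are integers on one contig.
    """
    snps = sorted(snp_positions)
    junctions = sorted(junction_positions)
    kept = []
    for idx, pos in enumerate(snps):
        if min_distance(pos, junctions) < min_gap:
            continue  # too close to an intron-exon junction
        left_ok = idx == 0 or pos - snps[idx - 1] >= min_gap
        right_ok = idx == len(snps) - 1 or snps[idx + 1] - pos >= min_gap
        if left_ok and right_ok:
            kept.append(pos)
    return kept

if __name__ == "__main__":
    # SNPs at 100 and 120 flank each other, 300 is 20 bp from a junction at 280.
    print(filter_snps([100, 120, 300, 500], [280, 700]))
```

A filter of this kind is typically applied so that primers or probes designed around each retained SNP do not span a splice junction or a second polymorphism.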
PLOS ONE | 2012
Sung-Chur Sim; Allen Van Deynze; Kevin Stoffel; David S. Douches; Daniel G. Zarka; Martin W. Ganal; Roger T. Chetelat; Samuel F. Hutton; John W. Scott; Randolph G. Gardner; Dilip R. Panthee; Martha A. Mutschler; James R. Myers; David M. Francis
The effects of selection on genome variation were investigated and visualized in tomato using a high-density single nucleotide polymorphism (SNP) array. 7,720 SNPs were genotyped on a collection of 426 tomato accessions (410 inbreds and 16 hybrids) and over 97% of the markers were polymorphic in the entire collection. Principal component analysis (PCA) and pairwise estimates of Fst supported that the inbred accessions represented seven sub-populations including processing, large-fruited fresh market, large-fruited vintage, cultivated cherry, landrace, wild cherry, and S. pimpinellifolium. Further divisions were found within both the contemporary processing and fresh market sub-populations. These sub-populations showed higher levels of genetic diversity relative to the vintage sub-population. The array provided a large number of polymorphic SNP markers across each sub-population, ranging from 3,159 in the vintage accessions to 6,234 in the cultivated cherry accessions. Visualization of minor allele frequency revealed regions of the genome that distinguished three representative sub-populations of cultivated tomato (processing, fresh market, and vintage), particularly on chromosomes 2, 4, 5, 6, and 11. The PCA loadings and Fst outlier analysis between these three sub-populations identified a large number of candidate loci under positive selection on chromosomes 4, 5, and 11. The extent of linkage disequilibrium (LD) was examined within each chromosome for these sub-populations. LD decay varied between chromosomes and sub-populations, with large differences reflective of breeding history. For example, on chromosome 11, decay occurred over 0.8 cM for processing accessions and over 19.7 cM for fresh market accessions. The observed SNP variation and LD decay suggest that different patterns of genetic variation in cultivated tomato are due to introgression from wild species and selection for market specialization.
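The minor allele frequency mentioned in the abstract above is a simple per-marker statistic. The following Python sketch shows one way to compute it. The genotype encoding (two-character strings such as "AG", with "--" or None for missing calls) and the function name are assumptions for illustration, not the format used in the study.

```python
from collections import Counter

def minor_allele_frequency(genotypes):
    """Return the minor allele frequency for one biallelic SNP.

    genotypes: iterable of two-character diploid calls (e.g. "AA", "AG").
    Missing calls (None or any call containing "-") are skipped.
    Returns None if no calls remain, 0.0 if the marker is monomorphic.
    """
    counts = Counter()
    for call in genotypes:
        if call is None or "-" in call:
            continue
        counts.update(call)  # count each of the two alleles
    total = sum(counts.values())
    if total == 0:
        return None
    if len(counts) > 2:
        raise ValueError("more than two alleles observed; marker is not biallelic")
    if len(counts) < 2:
        return 0.0
    return min(counts.values()) / total

if __name__ == "__main__":
    calls = ["AA", "AG", "GG", "AA", "--"]
    print(minor_allele_frequency(calls))  # A: 5 alleles, G: 3 alleles
```

Under this definition a marker counts as polymorphic in a sub-population when its minor allele frequency there is greater than zero.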
BMC Genomics | 2007
Allen Van Deynze; Kevin Stoffel; C. Robin Buell; Alexander Kozik; Jia Liu; Esther van der Knaap; David M. Francis
Background: Tomato has excellent genetic and genomic resources including a broad set of Expressed Sequence Tag (EST) data and high-density genetic maps. In addition, emerging physical maps and bacterial artificial clone sequence data serve as template to investigate genetic variation within the cultivated germplasm pool with the goal to manipulate agriculturally important traits. Unfortunately, the nearly exclusive focus of resource development on interspecific populations for genetic analyses and diversity studies has left a void in our understanding of genotypic variation within tomato breeding programs that focus on intra-specific populations. We describe the results of a study to identify nucleotide variation within tomato breeding germplasm and mapping parents for a set of conserved single-copy ESTs that are orthologous between tomato and Arabidopsis.
Results: Using a pooled sequencing strategy, 967 tomato transcripts were screened for polymorphism in 12 tomato lines. Although intron position was conserved, intron lengths were 2-fold larger in tomato than in Arabidopsis. A total of 1,487 single nucleotide polymorphisms and 282 insertion/deletions were identified, of which 579 and 206 were polymorphic in breeding germplasm, respectively. Fresh market and processing germplasm were clearly divergent, as were Solanum lycopersicum var. cerasiformae and Solanum pimpinellifolium, tomato's closest relatives. The polymorphisms identified serve as marker resources for tomato. The COS is also applicable to other Solanaceae crops.
Conclusions: The results from this research enabled significant progress towards bridging the gap between genetic and genomic resources developed for populations derived from wide crosses and those applicable to intra-specific crosses for breeding in tomato.
G3: Genes, Genomes, Genetics | 2013
Maria Jose Truco; Hamid Ashrafi; Alexander Kozik; Hans van Leeuwen; John E. Bowers; Sebastian Reyes Chin Wo; Kevin Stoffel; Huaqin Xu; Theresa Hill; Allen Van Deynze; Richard W. Michelmore
We have generated an ultra-high-density genetic map for lettuce, an economically important member of the Compositae, consisting of 12,842 unigenes (13,943 markers) mapped in 3696 genetic bins distributed over nine chromosomal linkage groups. Genomic DNA was hybridized to a custom Affymetrix oligonucleotide array containing 6.4 million features representing 35,628 unigenes of Lactuca spp. Segregation of single-position polymorphisms was analyzed using 213 F7:8 recombinant inbred lines that had been generated by crossing cultivated Lactuca sativa cv. Salinas and L. serriola acc. US96UC23, the wild progenitor species of L. sativa. The high level of replication of each allele in the recombinant inbred lines was exploited to identify single-position polymorphisms that were assigned to parental haplotypes. Marker information has been made available using GBrowse to facilitate access to the map. This map has been anchored to the previously published integrated map of lettuce providing candidate genes for multiple phenotypes. The high density of markers achieved in this ultradense map allowed syntenic studies between lettuce and Vitis vinifera as well as other plant species.
BMC Plant Biology | 2009
Allen Van Deynze; Kevin Stoffel; Mike Lee; Thea A. Wilkins; Alexander Kozik; Roy G. Cantrell; John Z. Yu; Russel J Kohel; David M. Stelly
Background: Cultivated cotton is an annual fiber crop derived mainly from two perennial species, Gossypium hirsutum L. or upland cotton, and G. barbadense L., extra long-staple fiber Pima or Egyptian cotton. These two cultivated species are among five allotetraploid species presumably derived monophyletically between G. arboreum and G. raimondii. Genomic-based approaches have been hindered by the limited variation within species. Yet, population-based methods are being used for genome-wide introgression of novel alleles from G. mustelinum and G. tomentosum into G. hirsutum using combinations of backcrossing, selfing, and inter-mating. Recombinant inbred line populations between genetics standards TM-1, (G. hirsutum) × 3-79 (G. barbadense) have been developed to allow high-density genetic mapping of traits.
Results: This paper describes a strategy to efficiently characterize genomic variation (SNPs and indels) within and among cotton species. Over 1000 SNPs from 270 loci and 279 indels from 92 loci segregating in G. hirsutum and G. barbadense were genotyped across a standard panel of 24 lines, 16 of which are elite cotton breeding lines and 8 mapping parents of populations from six cotton species. Over 200 loci were genetically mapped in a core mapping population derived from TM-1 and 3-79 and in G. hirsutum breeding germplasm.
Conclusion: In this research, SNP and indel diversity is characterized for 270 single-copy polymorphic loci in cotton. A strategy for SNP discovery is defined to pre-screen loci for copy number and polymorphism. Our data indicate that the A and D genomes in both diploid and tetraploid cotton remain distinct from each other such that paralogs can be distinguished. This research provides mapped DNA markers for intra-specific crosses and introgression of exotic germplasm in cotton.
The Plant Genome | 2012
John P. Hamilton; Sung Chur Sim; Kevin Stoffel; Allen Van Deynze; C. Robin Buell; David M. Francis
Plant breeding is enhanced by the availability of molecular markers for rapid screening and selection in populations. Identification of polymorphic loci in cultivated tomato (Solanum lycopersicum L.) has been hampered by limited genome sampling across cultivated types. Whole transcriptome sequencing of six accessions that span cultivated market classes was performed using sequencing by synthesis. A total of 291,915,037 quality filtered reads representing 17 Gb of sequence were generated. Assembly of the reads resulted in 30.6 to 34.9 Mb of sequence for each of the six accessions and provided representation of 55.3 to 59.6% of the predicted tomato gene set with a wide range of molecular function Gene Ontologies (GOs) represented. A computational pipeline was developed to identify single nucleotide polymorphisms (SNPs). When coupled with two Sanger‐derived expressed sequence tag datasets and a reference genome, 62,576 nonredundant putative SNPs in tomato were identified. The SNPs within the contigs were present within all of the GO molecular function categories. The computational pipeline had validation rates in SNP genotyping assays that ranged from 95 to 100%, and the utility of these SNPs for assessing genetic variation within cultivated and wild populations was demonstrated. Collectively, the transcript sequences and the annotated SNPs provide a resource to facilitate tomato genetics and breeding efforts.
PLOS ONE | 2013
Theresa Hill; Hamid Ashrafi; Sebastian Reyes-Chin-Wo; JiQiang Q. Yao; Kevin Stoffel; Maria Jose Truco; Alexander Kozik; Richard W. Michelmore; Allen Van Deynze
The widely cultivated pepper, Capsicum spp., important as a vegetable and spice crop world-wide, is one of the most diverse crops. To enhance breeding programs, a detailed characterization of Capsicum diversity including morphological, geographical and molecular data is required. Currently, molecular data characterizing Capsicum genetic diversity is limited. The development and application of high-throughput genome-wide markers in Capsicum will facilitate more detailed molecular characterization of germplasm collections, genetic relationships, and the generation of ultra-high density maps. We have developed the Pepper GeneChip® array from Affymetrix for polymorphism detection and expression analysis in Capsicum. Probes on the array were designed from 30,815 unigenes assembled from expressed sequence tags (ESTs). Our array design provides a maximum redundancy of 13 probes per base pair position allowing integration of multiple hybridization values per position to detect single position polymorphism (SPP). Hybridization of genomic DNA from 40 diverse C. annuum lines, used in breeding and research programs, and a representative from three additional cultivated species (C. frutescens, C. chinense and C. pubescens) detected 33,401 SPP markers within 13,323 unigenes. Among the C. annuum lines, 6,426 SPPs covering 3,818 unigenes were identified. An estimated three-fold reduction in diversity was detected in non-pungent compared with pungent lines; however, we were able to detect 251 highly informative markers across these C. annuum lines. In addition, an 8.7 cM region without polymorphism was detected around Pun1 in non-pungent C. annuum. An analysis of genetic relatedness and diversity using the software Structure revealed clustering of the germplasm which was confirmed with statistical support by principal component analysis (PCA) and phylogenetic analysis. 
This research demonstrates the effectiveness of parallel high-throughput discovery and application of genome-wide transcript-based markers to assess genetic and genomic features among Capsicum annuum.
BMC Genomics | 2012
Kevin Stoffel; Hans van Leeuwen; Alexander Kozik; David Caldwell; Hamid Ashrafi; Xinping Cui; Xiaoping Tan; Theresa Hill; Sebastian Reyes-Chin-Wo; Maria Jose Truco; Richard W. Michelmore; Allen Van Deynze
Background: High-resolution genetic maps are needed in many crops to help characterize the genetic diversity that determines agriculturally important traits. Hybridization to microarrays to detect single feature polymorphisms is a powerful technique for marker discovery and genotyping because of its highly parallel nature. However, microarrays designed for gene expression analysis rarely provide sufficient gene coverage for optimal detection of nucleotide polymorphisms, which limits utility in species with low rates of polymorphism such as lettuce (Lactuca sativa).
Results: We developed a 6.5 million feature Affymetrix GeneChip® for efficient polymorphism discovery and genotyping, as well as for analysis of gene expression in lettuce. Probes on the microarray were designed from 26,809 unigenes from cultivated lettuce and an additional 8,819 unigenes from four related species (L. serriola, L. saligna, L. virosa and L. perennis). Where possible, probes were tiled with a 2 bp stagger, alternating on each DNA strand; providing an average of 187 probes covering approximately 600 bp for each of over 35,000 unigenes; resulting in up to 13 fold redundancy in coverage per nucleotide. We developed protocols for hybridization of genomic DNA to the GeneChip® and refined custom algorithms that utilized coverage from multiple, high quality probes to detect single position polymorphisms in 2 bp sliding windows across each unigene. This allowed us to detect greater than 18,000 polymorphisms between the parental lines of our core mapping population, as well as numerous polymorphisms between cultivated lettuce and wild species in the lettuce genepool. Using marker data from our diversity panel comprised of 52 accessions from the five species listed above, we were able to separate accessions by species using both phylogenetic and principal component analyses. Additionally, we estimated the diversity between different types of cultivated lettuce and distinguished morphological types.
Conclusion: By hybridizing genomic DNA to a custom oligonucleotide array designed for maximum gene coverage, we were able to identify polymorphisms using two approaches for pair-wise comparisons, as well as a highly parallel method that compared all 52 genotypes simultaneously.
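The "up to 13 fold redundancy" reported in the abstract above follows from the tiling geometry if probes are 25 nt long, which is the usual Affymetrix probe length but is an assumption here, since the abstract does not state it. The Python sketch below tiles a target with a 2 bp stagger, alternates strands, and counts how many probes cover each base. All function names are hypothetical.

```python
def tile_starts(target_len, probe_len=25, stagger=2):
    """Start positions (0-based) of probes tiled along a target."""
    return list(range(0, target_len - probe_len + 1, stagger))

def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def tile_probes(seq, probe_len=25, stagger=2):
    """Return (start, strand, probe_sequence) tuples, alternating strands."""
    probes = []
    for i, start in enumerate(tile_starts(len(seq), probe_len, stagger)):
        window = seq[start:start + probe_len]
        if i % 2 == 0:
            probes.append((start, "+", window))
        else:
            probes.append((start, "-", reverse_complement(window)))
    return probes

def coverage(target_len, probe_len=25, stagger=2):
    """Number of probes overlapping each base of the target."""
    depth = [0] * target_len
    for start in tile_starts(target_len, probe_len, stagger):
        for pos in range(start, start + probe_len):
            depth[pos] += 1
    return depth

if __name__ == "__main__":
    depth = coverage(600)
    print(max(depth))  # maximum number of probes covering any single base
```

With 25 nt probes and a 2 bp stagger, an interior base at an even offset is covered by 13 probes and one at an odd offset by 12, which matches the stated maximum. The average of 187 probes per unigene in the abstract is lower than a full tiling of 600 bp would give, consistent with probes being tiled only "where possible".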
BMC Genomics | 2014
Amanda M. Hulse-Kemp; Hamid Ashrafi; Xiuting Zheng; Fei Wang; Kevin A. Hoegenauer; Andrea Bv Maeda; S Samuel Yang; Kevin Stoffel; Marta Matvienko; Kimberly Clemons; Allen Van Deynze; Don C. Jones; David M. Stelly
Background: Cotton (Gossypium spp.) is the largest producer of natural fibers for textile and is an important crop worldwide. Crop production is comprised primarily of G. hirsutum L., an allotetraploid. However, elite cultivars express very small amounts of variation due to the species' monophyletic origin, domestication and further bottlenecks due to selection. Conversely, wild cotton species harbor extensive genetic diversity of prospective utility to improve many beneficial agronomic traits, fiber characteristics, and resistance to disease and drought. Introgression of traits from wild species can provide a natural way to incorporate advantageous traits through breeding to generate higher-producing cotton cultivars and more sustainable production systems. Interspecific introgression efforts by conventional methods are very time-consuming and costly, but can be expedited using marker-assisted selection.
Results: Using transcriptome sequencing we have developed the first gene-associated single nucleotide polymorphism (SNP) markers for wild cotton species G. tomentosum, G. mustelinum, G. armourianum and G. longicalyx. Markers were also developed for a secondary cultivated species G. barbadense cv. 3–79. A total of 62,832 non-redundant SNP markers were developed from the five wild species which can be utilized for interspecific germplasm introgression into cultivated G. hirsutum and are directly associated with genes. Over 500 of the G. barbadense markers have been validated by whole-genome radiation hybrid mapping. Overall 1,060 SNPs from the five different species have been screened and shown to produce acceptable genotyping assays.
Conclusions: This large set of 62,832 SNPs relative to cultivated G. hirsutum will allow for the first high-density mapping of genes from five wild species that affect traits of interest, including beneficial agronomic and fiber characteristics. Upon mapping, the markers can be utilized for marker-assisted introgression of new germplasm into cultivated cotton and in subsequent breeding of agronomically adapted types, including cultivar development.
G3: Genes, Genomes, Genetics | 2015
Amanda M. Hulse-Kemp; Hamid Ashrafi; Kevin Stoffel; Xiuting Zheng; Christopher A. Saski; Brian E. Scheffler; David D. Fang; Z. Jeffrey Chen; Allen Van Deynze; David M. Stelly
A bacterial artificial chromosome library and BAC-end sequences for cultivated cotton (Gossypium hirsutum L.) have recently been developed. This report presents genome-wide single nucleotide polymorphism (SNP) mining utilizing resequencing data with BAC-end sequences as a reference by alignment of 12 G. hirsutum L. lines, one G. barbadense L. line, and one G. longicalyx Hutch and Lee line. A total of 132,262 intraspecific SNPs have been developed for G. hirsutum, whereas 223,138 and 470,631 interspecific SNPs have been developed for G. barbadense and G. longicalyx, respectively. Using a set of interspecific SNPs, 11 randomly selected and 77 SNPs that are putatively associated with the homeologous chromosome pair 12 and 26, we mapped 77 SNPs into two linkage groups representing these chromosomes, spanning a total of 236.2 cM in an interspecific F2 population (G. barbadense 3-79 × G. hirsutum TM-1). The mapping results validated the approach for reliably producing large numbers of both intraspecific and interspecific SNPs aligned to BAC-ends. This will allow for future construction of high-density integrated physical and genetic maps for cotton and other complex polyploid genomes. The methods developed will allow for future Gossypium resequencing data to be automatically genotyped for identified SNPs along the BAC-end sequence reference for anchoring sequence assemblies and comparative studies.