Publication


Featured research published by Yong Q. Gu.


Proceedings of the National Academy of Sciences of the United States of America | 2013

A 4-gigabase physical map unlocks the structure and evolution of the complex genome of Aegilops tauschii, the wheat D-genome progenitor

Ming-Cheng Luo; Yong Q. Gu; Frank M. You; Karin R. Deal; Yaqin Ma; Yuqin Hu; Naxin Huo; Yi Wang; Ji-Rui Wang; Shiyong Chen; Chad M. Jorgensen; Yong Zhang; Patrick E. McGuire; Shiran Pasternak; Joshua C. Stein; Doreen Ware; Melissa Kramer; W. Richard McCombie; Shahryar F. Kianian; Mihaela Martis; Klaus F. X. Mayer; Sunish K. Sehgal; Wanlong Li; Bikram S. Gill; Michael W. Bevan; Hana Šimková; Jaroslav Doležel; Song Weining; Gerard R. Lazo; Olin D. Anderson

The current limitations in genome sequencing technology require the construction of physical maps for high-quality draft sequences of large plant genomes, such as that of Aegilops tauschii, the wheat D-genome progenitor. To construct a physical map of the Ae. tauschii genome, we fingerprinted 461,706 bacterial artificial chromosome clones, assembled contigs, designed a 10K Ae. tauschii Infinium SNP array, constructed a 7,185-marker genetic map, and anchored on the map contigs totaling 4.03 Gb. Using whole genome shotgun reads, we extended the SNP marker sequences and found 17,093 genes and gene fragments. We showed that collinearity of the Ae. tauschii genes with Brachypodium distachyon, rice, and sorghum decreased with phylogenetic distance and that structural genome evolution rates have been high across all investigated lineages in subfamily Pooideae, including that of Brachypodieae. We obtained additional information about the evolution of the seven Triticeae chromosomes from 12 ancestral chromosomes and uncovered a pattern of centromere inactivation accompanying nested chromosome insertions in grasses. We showed that the density of noncollinear genes along the Ae. tauschii chromosomes positively correlates with recombination rates, suggested a cause, and showed that new genes, exemplified by disease resistance genes, are preferentially located in high-recombination chromosome regions.


BMC Genomics | 2011

Annotation-based genome-wide SNP discovery in the large and complex Aegilops tauschii genome using next-generation sequencing without a reference genome sequence

Frank M. You; Naxin Huo; Karin R. Deal; Yong Q. Gu; Ming-Cheng Luo; Patrick E. McGuire; Jan Dvorak; Olin D. Anderson

Background: Many plants have large and complex genomes with an abundance of repeated sequences. Many plants are also polyploid. Both of these attributes typify the genome architecture in the tribe Triticeae, whose members include economically important wheat, rye and barley. Large genome sizes, an abundance of repeated sequences, and polyploidy present challenges to genome-wide SNP discovery using next-generation sequencing (NGS) of total genomic DNA by making alignment and clustering of short reads generated by the NGS platforms difficult, particularly in the absence of a reference genome sequence.
Results: An annotation-based, genome-wide SNP discovery pipeline is reported using NGS data for large and complex genomes without a reference genome sequence. Roche 454 shotgun reads with low genome coverage of one genotype are annotated in order to distinguish single-copy sequences and repeat junctions from repetitive sequences and sequences shared by paralogous genes. Multiple genome equivalents of shotgun reads of another genotype generated with SOLiD or Solexa are then mapped to the annotated Roche 454 reads to identify putative SNPs. A pipeline program package, AGSNP, was developed and used for genome-wide SNP discovery in Aegilops tauschii, the diploid source of the wheat D genome, which has a genome size of 4.02 Gb, of which 90% is repetitive sequences. Genomic DNA of Ae. tauschii accession AL8/78 was sequenced with the Roche 454 NGS platform. Genomic DNA and cDNA of Ae. tauschii accession AS75 were sequenced primarily with SOLiD, although some Solexa and Roche 454 genomic sequences were also generated. A total of 195,631 putative SNPs were discovered in gene sequences, 155,580 putative SNPs were discovered in uncharacterized single-copy regions, and another 145,907 putative SNPs were discovered in repeat junctions. These SNPs were dispersed across the entire Ae. tauschii genome. To assess the false positive SNP discovery rate, DNA containing putative SNPs was amplified by PCR from AL8/78 and AS75 and resequenced with the ABI 3730 xl. In a sample of 302 randomly selected putative SNPs, 84.0% in gene regions, 88.0% in repeat junctions, and 81.3% in uncharacterized regions were validated.
Conclusion: An annotation-based genome-wide SNP discovery pipeline for NGS platforms was developed. The pipeline is suitable for SNP discovery in genomic libraries of complex genomes and does not require a reference genome sequence. The pipeline is applicable to all current NGS platforms, provided that at least one such platform generates relatively long reads. The pipeline package, AGSNP, and the discovered 497,118 Ae. tauschii SNPs can be accessed at http://avena.pw.usda.gov/wheatD/agsnp.shtml.


BMC Plant Biology | 2009

Development of SSR markers and analysis of diversity in Turkish populations of Brachypodium distachyon

John P. Vogel; Metin Tuna; Hikmet Budak; Naxin Huo; Yong Q. Gu; Michael A Steinwand

Background: Brachypodium distachyon (Brachypodium) is rapidly emerging as a powerful model system to facilitate research aimed at improving grass crops for grain, forage and energy production. To characterize the natural diversity of Brachypodium and provide a valuable new tool to the growing list of resources available to Brachypodium researchers, we created and characterized a large, diverse collection of inbred lines.
Results: We developed 84 inbred lines from eight locations in Turkey. To enable genotypic characterization of this collection, we created 398 SSR markers from BAC end and EST sequences. An analysis of 187 diploid lines from 56 locations with 43 SSR markers showed considerable genotypic diversity. There was some correlation between SSR genotypes and broad geographic regions, but there was also a high level of genotypic diversity at individual locations. Phenotypic analysis of this new germplasm resource revealed considerable variation in flowering time, seed size, and plant architecture. The inbreeding nature of Brachypodium was confirmed by an extremely high level of homozygosity in wild plants and a lack of cross-pollination under laboratory conditions.
Conclusion: Taken together, the inbreeding nature and genotypic diversity observed at individual locations suggest a significant amount of long-distance seed dispersal. The resources developed in this study are freely available to the research community and will facilitate experimental applications based on natural diversity.


BMC Genomics | 2010

Nucleotide diversity maps reveal variation in diversity among wheat genomes and chromosomes

Eduard D. Akhunov; Alina Akhunova; Olin D. Anderson; James A. Anderson; N. K. Blake; Michael T. Clegg; Devin Coleman-Derr; Emily J. Conley; Curt Crossman; Karin R. Deal; Jorge Dubcovsky; Bikram S. Gill; Yong Q. Gu; Jakub Hadam; Hwa-Young Heo; Naxin Huo; Gerard R. Lazo; Ming-Cheng Luo; Yaqin Q. Ma; David E. Matthews; Patrick E. McGuire; Peter L. Morrell; Calvin O. Qualset; James Renfro; Dindo Tabanao; L. E. Talbert; Chao Tian; Donna M. Toleno; Marilyn L. Warburton; Frank M. You

Background: A genome-wide assessment of nucleotide diversity in a polyploid species must minimize the inclusion of homoeologous sequences into diversity estimates and reliably allocate individual haplotypes into their respective genomes. The same requirements complicate the development and deployment of single nucleotide polymorphism (SNP) markers in polyploid species. We report here a strategy that satisfies these requirements and deploy it in the sequencing of genes in cultivated hexaploid wheat (Triticum aestivum, genomes AABBDD) and wild tetraploid wheat (Triticum turgidum ssp. dicoccoides, genomes AABB) from the putative site of wheat domestication in Turkey. Data are used to assess the distribution of diversity among and within wheat genomes and to develop a panel of SNP markers for polyploid wheat.
Results: Nucleotide diversity was estimated in 2,114 wheat genes and was similar between the A and B genomes and reduced in the D genome. Within a genome, diversity was diminished on some chromosomes. Low diversity was always accompanied by an excess of rare alleles. A total of 5,471 SNPs were discovered in 1,791 wheat genes. Totals of 1,271, 1,218, and 2,203 SNPs were discovered in 488, 463, and 641 genes of wheat putative diploid ancestors, T. urartu, Aegilops speltoides, and Ae. tauschii, respectively. A public database containing genome-specific primers, SNPs, and other information was constructed. A total of 987 genes with nucleotide diversity estimated in one or more of the wheat genomes were placed on an Ae. tauschii genetic map, and the map was superimposed on wheat deletion-bin maps. The agreement between the maps was assessed.
Conclusions: In a young polyploid, exemplified by T. aestivum, ancestral species are the primary source of genetic diversity. Low effective recombination due to self-pollination and a genetic mechanism precluding homoeologous chromosome pairing during polyploid meiosis can lead to the loss of diversity from large chromosomal regions. The net effect of these factors in T. aestivum is large variation in diversity among genomes and chromosomes, which impacts the development of SNP markers and their practical utility. Accumulation of new mutations in older polyploid species, such as wild emmer, results in increased diversity and its more uniform distribution across the genome.
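
As a rough illustration of what a nucleotide diversity estimate measures, the sketch below computes π, the average number of pairwise differences per site, for a toy alignment. The abstract does not say which estimator the authors used, so treating π as the statistic is an assumption here, and the function name and sequences are invented for the example.

```python
from itertools import combinations


def nucleotide_diversity(haplotypes):
    """Return pi: mean pairwise differences per site for aligned sequences.

    Hypothetical helper for illustration only; not code from the study.
    """
    if len(haplotypes) < 2:
        raise ValueError("need at least two sequences")
    length = len(haplotypes[0])
    if any(len(h) != length for h in haplotypes):
        raise ValueError("sequences must be aligned to the same length")

    pairs = list(combinations(haplotypes, 2))
    # Count mismatching sites for every pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(x, y)) for x, y in pairs
    )
    return total_diffs / (len(pairs) * length)


# Toy alignment: four haplotypes of one 10-bp gene fragment (made-up data).
toy_alignment = [
    "ACGTACGTAC",
    "ACGTACGTAC",
    "ACGTTCGTAC",
    "ACGAACGTAC",
]
pi = nucleotide_diversity(toy_alignment)
print(f"pi = {pi:.3f}")
```

A lower value for one genome than for the others, as reported for the D genome, would simply mean that randomly chosen pairs of haplotypes from that genome differ at fewer sites.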


BMC Research Notes | 2008

The complete chloroplast genome sequence of Brachypodium distachyon: sequence comparison and phylogenetic analysis of eight grass plastomes

Esteban Bortiri; Devin Coleman-Derr; Gerard R. Lazo; Olin D. Anderson; Yong Q. Gu

Background: Wheat, barley, and rye, of tribe Triticeae in the Poaceae, are among the most important crops worldwide but they present many challenges to genomics-aided crop improvement. Brachypodium distachyon, a close relative of those cereals, has recently emerged as a model for grass functional genomics. Sequencing of the nuclear and organelle genomes of Brachypodium is one of the first steps towards making this species available as a tool for researchers interested in cereal biology.
Findings: The chloroplast genome of Brachypodium distachyon was sequenced by a combinational approach using BAC end and shotgun sequences derived from a selected BAC containing the entire chloroplast genome. Comparative analysis indicated that the chloroplast genome is conserved in gene number and organization with respect to those of other cereals. However, several Brachypodium genes evolve at a faster rate than those in other grasses. Sequence analysis reveals that rice and wheat have a ~2.1 kb deletion in their plastid genomes and this deletion must have occurred independently in both species.
Conclusion: We demonstrate that BAC libraries can be used to sequence plastid, and likely other organellar, genomes. As expected, the Brachypodium chloroplast genome is very similar to those of other sequenced grasses. The phylogenetic analyses and the pattern of insertions and deletions in the chloroplast genome confirmed that Brachypodium is a close relative of the tribe Triticeae. Nevertheless, we show that some large indels can arise multiple times and may confound phylogenetic reconstruction.


BMC Genomics | 2009

A BAC-based physical map of Brachypodium distachyon and its comparative analysis with rice and wheat

Yong Q. Gu; Yaqin Ma; Naxin Huo; John P. Vogel; Frank M. You; Gerard R. Lazo; William Nelson; Carol Soderlund; Jan Dvorak; Olin D. Anderson; Ming-Cheng Luo

Background: Brachypodium distachyon (Brachypodium) has been recognized as a new model species for comparative and functional genomics of cereal and bioenergy crops because it possesses many biological attributes desirable in a model, such as a small genome size, short stature, self-pollinating habit, and short generation cycle. To maximize the utility of Brachypodium as a model for basic and applied research it is necessary to develop genomic resources for it. A BAC-based physical map is one of them. A physical map will facilitate analysis of genome structure, comparative genomics, and assembly of the entire genome sequence.
Results: A total of 67,151 Brachypodium BAC clones were fingerprinted with the SNaPshot HICF fingerprinting method and a genome-wide physical map of the Brachypodium genome was constructed. The map consisted of 671 contigs and 2,161 clones remained as singletons. The contigs and singletons spanned 414 Mb. A total of 13,970 gene-related sequences were detected in the BAC end sequences (BES). These gene tags aligned 345 contigs with 336 Mb of rice genome sequence, showing that Brachypodium and rice genomes are generally highly colinear. Divergent regions were mainly in the rice centromeric regions. A dot-plot of Brachypodium contigs against the rice genome sequences revealed remnants of the whole-genome duplication caused by paleotetraploidy, which were previously found in rice and sorghum. Brachypodium contigs were anchored to the wheat deletion bin maps with the BES gene-tags, opening the door to Brachypodium-Triticeae comparative genomics.
Conclusion: The construction of the Brachypodium physical map and its comparison with the rice genome sequence demonstrated the utility of the SNaPshot-HICF method in the construction of BAC-based physical maps. The map represents an important genomic resource for the completion of the Brachypodium genome sequence and grass comparative genomics. A draft of the physical map and its comparisons with rice and wheat are available at http://phymap.ucdavis.edu/brachypodium/.


BMC Genomics | 2012

Physical mapping resources for large plant genomes: radiation hybrids for wheat D-genome progenitor Aegilops tauschii

Ajay Kumar; Kristin Simons; Muhammad J. Iqbal; Monika Michalak de Jiménez; Filippo M. Bassi; Farhad Ghavami; Omar Al-Azzam; Thomas Drader; Yi Wang; Ming-Cheng Luo; Yong Q. Gu; Anne M. Denton; Gerard R. Lazo; Steven S. Xu; Jan Dvorak; Penny M.A. Kianian; Shahryar F. Kianian

Background: Development of a high quality reference sequence is a daunting task in crops like wheat with a large (~17 Gb), highly repetitive (>80%) and polyploid genome. To achieve complete sequence assembly of such genomes, development of a high quality physical map is a necessary first step. However, due to the lack of recombination in certain regions of the chromosomes, genetic mapping, which uses recombination frequency to map marker loci, alone is not sufficient to develop high quality marker scaffolds for a sequence ready physical map. Radiation hybrid (RH) mapping, which uses radiation induced chromosomal breaks, has proven to be a successful approach for developing marker scaffolds for sequence assembly in animal systems. Here, the development and characterization of an RH panel for mapping the D-genome of the wheat progenitor Aegilops tauschii is reported.
Results: Radiation dosages of 350 and 450 Gy were optimized for seed irradiation of a synthetic hexaploid (AABBDD) wheat with the D-genome of Ae. tauschii accession AL8/78. The surviving plants after irradiation were crossed to durum wheat (AABB) to produce pentaploid RH1s (AABBD), which allow the simultaneous mapping of the whole D-genome. A panel of 1,510 RH1 plants was obtained, of which 592 plants were generated from the mature RH1 seeds, and 918 plants were rescued through embryo culture due to poor germination (<3%) of mature RH1 seeds. This panel showed a homogenous marker loss (2.1%) after screening with SSR markers uniformly covering all the D-genome chromosomes. Different marker systems mostly detected different lines with deletions. Using markers covering known distances, the mapping resolution of this RH panel was estimated to be <140 kb. Analysis of only 16 RH lines carrying deletions on chromosome 2D resulted in a physical map with a cM/cR ratio of 1:5.2 and 15 distinct bins. Additionally, with this small set of lines, almost all the tested ESTs could be mapped. A set of the 399 most informative RH lines, with an average deletion frequency of ~10%, was identified for developing high density marker scaffolds of the D-genome.
Conclusions: The RH panel reported here is the first developed for any wild ancestor of a major cultivated plant species. The results provided insight into various aspects of RH mapping in plants, including the genetically effective cell number for wheat (for the first time) and the potential implementation of this technique in other plant species. This RH panel will be an invaluable resource for mapping gene based markers, developing a complete marker scaffold for the whole genome sequence assembly, fine mapping of markers and functional characterization of genes and gene networks present on the D-genome.


Nature | 2017

Genome sequence of the progenitor of the wheat D genome Aegilops tauschii

Ming-Cheng Luo; Yong Q. Gu; Daniela Puiu; Hao Wang; Sven O. Twardziok; Karin R. Deal; Naxin Huo; Tingting Zhu; Le Wang; Yi Wang; Patrick E. McGuire; Shuyang Liu; Hai Long; Ramesh K. Ramasamy; Juan C. Rodriguez; L. Van Sonny; Luxia Yuan; Zhenzhong Wang; Zhiqiang Xia; Lichan Xiao; Olin D. Anderson; Shuhong Ouyang; Yong Liang; Aleksey V. Zimin; Geo Pertea; Peng Qi; Jeffrey L. Bennetzen; Xiongtao Dai; Matthew Dawson; Hans-Georg Müller

Aegilops tauschii is the diploid progenitor of the D genome of hexaploid wheat (Triticum aestivum, genomes AABBDD) and an important genetic resource for wheat. The large size and highly repetitive nature of the Ae. tauschii genome have until now precluded the development of a reference-quality genome sequence. Here we use an array of advanced technologies, including ordered-clone genome sequencing, whole-genome shotgun sequencing, and BioNano optical genome mapping, to generate a reference-quality genome sequence for Ae. tauschii ssp. strangulata accession AL8/78, which is closely related to the wheat D genome. We show that compared to other sequenced plant genomes, including a much larger conifer genome, the Ae. tauschii genome contains unprecedented amounts of very similar repeated sequences. Our genome comparisons reveal that the Ae. tauschii genome has a greater number of dispersed duplicated genes than other sequenced genomes and its chromosomes have been structurally evolving an order of magnitude faster than those of other grass genomes. The decay of colinearity with other grass genomes correlates with recombination rates along chromosomes. We propose that the vast amounts of very similar repeated sequences cause frequent errors in recombination and lead to gene duplications and structural chromosome changes that drive fast genome evolution.


BMC Plant Biology | 2015

WheatExp: an RNA-seq expression database for polyploid wheat

Stephen Pearce; Hans Vazquez-Gross; Sayer Y. Herin; David L. Hane; Yi Wang; Yong Q. Gu; Jorge Dubcovsky

Background: For functional genomics studies, it is important to understand the dynamic expression profiles of transcribed genes in different tissues, stages of development and in response to environmental stimuli. The proliferation in the use of next-generation sequencing technologies by the plant research community has led to the accumulation of large volumes of expression data. However, analysis of these datasets is complicated by the frequent occurrence of polyploidy among economically-important crop species. In addition, processing and analyzing such large volumes of sequence data is a technical and time-consuming task, limiting their application in functional genomics studies, particularly for smaller laboratories which lack access to high-powered computing infrastructure. Wheat is a good example of a young polyploid species with three similar genomes (97% identical among homoeologous genes), rapidly accumulating RNA-seq datasets and a large research community.
Description: We present WheatExp, an expression database and visualization tool to analyze and compare homoeologue-specific transcript profiles across a broad range of tissues from different developmental stages in polyploid wheat. Beginning with publicly-available RNA-seq datasets, we developed a pipeline to distinguish between homoeologous transcripts from annotated genes in tetraploid and hexaploid wheat. Data from multiple studies is processed and compiled into a database which can be queried either by BLAST or by searching for a known gene of interest by name or functional domain. Expression data of multiple genes can be displayed side-by-side across all expression datasets providing immediate access to a comprehensive panel of expression data for specific subsets of wheat genes.
Conclusions: The development of a publicly accessible expression database hosted on the GrainGenes website (http://wheat.pw.usda.gov/WheatExp/), coupled with a simple and readily-comparable visualization tool, will empower the wheat research community to use RNA-seq data and to perform functional analyses of target genes. The presented expression data is homoeologue-specific, allowing for the analysis of relative contributions from each genome to the overall expression of a gene, a critical consideration for breeding applications. Our approach can be expanded to other polyploid species by adjusting sequence mapping parameters according to the specific divergence of their genomes.


PLOS ONE | 2013

Insular Organization of Gene Space in Grass Genomes

Andrea Gottlieb; Hans-Georg Müller; Alicia N. Massa; Humphrey Wanjugi; Karin R. Deal; Frank M. You; Xiangyang Xu; Yong Q. Gu; Ming-Cheng Luo; Olin D. Anderson; Agnes P. Chan; Pablo D. Rabinowicz; Katrien M. Devos; Jan Dvorak

Wheat and maize genes were hypothesized to be clustered into islands but the hypothesis was not statistically tested. The hypothesis is statistically tested here in four grass species differing in genome size, Brachypodium distachyon, Oryza sativa, Sorghum bicolor, and Aegilops tauschii. Density functions obtained under a model where gene locations follow a homogeneous Poisson process and thus are not clustered are compared with a model-free situation quantified through a non-parametric density estimate. A simple homogeneous Poisson model for gene locations is not rejected for the small O. sativa and B. distachyon genomes, indicating that genes are distributed largely uniformly in those species, but is rejected for the larger S. bicolor and Ae. tauschii genomes, providing evidence for clustering of genes into islands. It is proposed to call the gene islands “gene insulae” to distinguish them from other types of gene clustering that have been proposed. An average S. bicolor and Ae. tauschii insula is estimated to contain 3.7 and 3.9 genes with an average intergenic distance within an insula of 2.1 and 16.5 kb, respectively. Inter-insular distances are greater than 8 and 81 kb and average 15.1 and 205 kb, in S. bicolor and Ae. tauschii, respectively. A greater gene density observed in the distal regions of the Ae. tauschii chromosomes is shown to be primarily caused by shortening of inter-insular distances. The comparison of the four grass genomes suggests that gene locations are largely a function of a homogeneous Poisson process in small genomes. Nonrandom insertions of LTR retroelements during genome expansion create gene insulae, which become less dense and further apart with the increase in genome size. High concordance in relative lengths of orthologous intergenic distances among the investigated genomes, including the maize genome, suggests functional constraints on gene distribution in the grass genomes.
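
The contrast between a homogeneous Poisson layout and clustered "insulae" can be made concrete with a small simulation. The sketch below is not the authors' density-function comparison; it swaps in a much simpler diagnostic, the coefficient of variation (CV) of intergenic gaps, which is close to 1 when positions follow a homogeneous Poisson process and rises above 1 when genes are clustered. All function names, sizes, and parameters are hypothetical and chosen only for illustration.

```python
import random
import statistics


def gap_cv(positions):
    """Coefficient of variation of the gaps between sorted positions."""
    pts = sorted(positions)
    gaps = [b - a for a, b in zip(pts, pts[1:])]
    return statistics.pstdev(gaps) / statistics.fmean(gaps)


def simulate_uniform(n_genes, length, rng):
    """Gene positions placed uniformly at random (Poisson-like layout)."""
    return [rng.uniform(0, length) for _ in range(n_genes)]


def simulate_islands(n_islands, genes_per_island, length, spread, rng):
    """Gene positions grouped tightly around randomly placed island centres."""
    positions = []
    for _ in range(n_islands):
        centre = rng.uniform(0, length)
        positions.extend(
            rng.gauss(centre, spread) for _ in range(genes_per_island)
        )
    return positions


rng = random.Random(0)  # fixed seed so the run is repeatable
uniform_cv = gap_cv(simulate_uniform(4000, 1_000_000, rng))
island_cv = gap_cv(simulate_islands(1000, 4, 1_000_000, 50, rng))

print(f"uniform layout CV: {uniform_cv:.2f}")
print(f"island layout CV:  {island_cv:.2f}")
```

With the same total number of genes, the island layout mixes many short within-island gaps with a few long between-island gaps, which is the qualitative pattern the abstract describes for the larger genomes.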

Collaboration


Yong Q. Gu's collaborators.

Top Co-Authors

Ming-Cheng Luo (University of California)
Naxin Huo (United States Department of Agriculture)
Olin D. Anderson (United States Department of Agriculture)
Gerard R. Lazo (Agricultural Research Service)
Jan Dvorak (University of California)
Yi Wang (United States Department of Agriculture)
Frank M. You (Agriculture and Agri-Food Canada)
Karin R. Deal (University of California)
Shahryar F. Kianian (Agricultural Research Service)