Sharon Wei
Cold Spring Harbor Laboratory
Publications
Featured research published by Sharon Wei.
Nucleic Acids Research | 2016
Paul J. Kersey; James E. Allen; Irina M. Armean; Sanjay Boddu; Bruce J. Bolt; Denise R. Carvalho-Silva; Mikkel Christensen; Paul Davis; Lee J. Falin; Christoph Grabmueller; Jay Humphrey; Arnaud Kerhornou; Julia Khobova; Naveen K. Aranganathan; Nicholas Langridge; Ernesto Lowy; Mark D. McDowall; Uma Maheswari; Michael Nuhn; Chuang Kee Ong; Bert Overduin; Michael Paulini; Helder Pedro; Emily Perry; Giulietta Spudich; Electra Tapanari; Brandon Walts; Gareth Williams; Marcela Tello-Ruiz; Joshua C. Stein
Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species, complementing the resources for vertebrate genomics developed in the context of the Ensembl project (http://www.ensembl.org). Together, the two resources provide a consistent set of programmatic and interactive interfaces to a rich range of data including reference sequence, gene models, transcriptional data, genetic variation and comparative analysis. This paper provides an update to the previous publications about the resource, with a focus on recent developments. These include the development of new analyses and views to represent polyploid genomes (of which bread wheat is the primary exemplar); and the continued up-scaling of the resource, which now includes over 23 000 bacterial genomes, 400 fungal genomes and 100 protist genomes, in addition to 55 genomes from invertebrate metazoa and 39 genomes from plants. This dramatic increase in the number of included genomes is one part of a broader effort to automate the integration of archival data (genome sequence, but also associated RNA sequence data and variant calls) within the context of reference genomes and make it available through the Ensembl user interfaces.
Nucleic Acids Research | 2011
Ken Youens-Clark; Edward S. Buckler; Terry M. Casstevens; Charles Chen; Genevieve DeClerck; Paul S. Derwent; Palitha Dharmawardhana; Pankaj Jaiswal; Paul J. Kersey; A. S. Karthikeyan; Jerry Lu; Susan R. McCouch; Liya Ren; William Spooner; Joshua C. Stein; James Thomason; Sharon Wei; Doreen Ware
Now in its 10th year, the Gramene database (http://www.gramene.org) has grown from its primary focus on rice, the first fully-sequenced grass genome, to become a resource for major model and crop plants including Arabidopsis, Brachypodium, maize, sorghum, poplar and grape in addition to several species of rice. Gramene began with the addition of an Ensembl genome browser and has expanded in the last decade to become a robust resource for plant genomics hosting a wide array of data sets including quantitative trait loci (QTL), metabolic pathways, genetic diversity, genes, proteins, germplasm, literature, ontologies and a fully-structured markers and sequences database integrated with genome browsers and maps from various published studies (genetic, physical, bin, etc.). In addition, Gramene now hosts a variety of web services including a Distributed Annotation Server (DAS), BLAST and a public MySQL database. Twice a year, Gramene releases a major build of the database and makes interim releases to correct errors or to make important updates to software and/or data.
Nucleic Acids Research | 2014
Marcela K. Monaco; Joshua C. Stein; Sushma Naithani; Sharon Wei; Palitha Dharmawardhana; Sunita Kumari; Vindhya Amarasinghe; Ken Youens-Clark; James Thomason; Justin Preece; Shiran Pasternak; Andrew Olson; Yinping Jiao; Zhenyuan Lu; Daniel M. Bolser; Arnaud Kerhornou; Daniel M. Staines; Brandon Walts; Guanming Wu; Peter D'Eustachio; Robin Haw; David Croft; Paul J. Kersey; Lincoln Stein; Pankaj Jaiswal; Doreen Ware
Gramene (http://www.gramene.org) is a curated online resource for comparative functional genomics in crops and model plant species, currently hosting 27 fully and 10 partially sequenced reference genomes in its build number 38. Its strength derives from the application of a phylogenetic framework for genome comparison and the use of ontologies to integrate structural and functional annotation data. Whole-genome alignments complemented by phylogenetic gene family trees help infer syntenic and orthologous relationships. Genetic variation data, sequences and genome mappings available for 10 species, including Arabidopsis, rice and maize, help infer putative variant effects on genes and transcripts. The pathways section also hosts 10 species-specific metabolic pathways databases developed in-house or by our collaborators using Pathway Tools software, which facilitates searches for pathway, reaction and metabolite annotations, and allows analyses of user-defined expression datasets. Recently, we released a Plant Reactome portal featuring 133 curated rice pathways. This portal will be expanded for Arabidopsis, maize and other plant species. We continue to provide genetic and QTL maps and marker datasets developed by crop researchers. The project provides a unique community platform to support scientific research in plant genomics including studies in evolution, genetics, plant breeding, molecular biology, biochemistry and systems biology.
Nucleic Acids Research | 2016
Marcela K. Tello-Ruiz; Joshua C. Stein; Sharon Wei; Justin Preece; Andrew Olson; Sushma Naithani; Vindhya Amarasinghe; Palitha Dharmawardhana; Yinping Jiao; Joseph Mulvaney; Sunita Kumari; Kapeel Chougule; Justin Elser; Bo Wang; James Thomason; Daniel M. Bolser; Arnaud Kerhornou; Brandon Walts; Nuno A. Fonseca; Laura Huerta; Maria Keays; Y. Amy Tang; Helen Parkinson; Antonio Fabregat; Sheldon J. McKay; Joel Weiser; Peter D'Eustachio; Lincoln Stein; Robert Petryszak; Paul J. Kersey
Gramene (http://www.gramene.org) is an online resource for comparative functional genomics in crops and model plant species. Its two main frameworks are genomes (collaboration with Ensembl Plants) and pathways (The Plant Reactome and archival BioCyc databases). Since our last NAR update, the database website adopted a new Drupal management platform. The genomes section features 39 fully assembled reference genomes that are integrated using ontology-based annotation and comparative analyses, and accessed through both visual and programmatic interfaces. Additional community data, such as genetic variation, expression and methylation, are also mapped for a subset of genomes. The Plant Reactome pathway portal (http://plantreactome.gramene.org) provides a reference resource for analyzing plant metabolic and regulatory pathways. In addition to ∼200 curated rice reference pathways, the portal hosts gene homology-based pathway projections for 33 plant species. Both the genome and pathway browsers interface with the EMBL-EBI's Expression Atlas to enable the projection of baseline and differential expression data from curated expression studies in plants. Gramene's archive website (http://archive.gramene.org) continues to provide previously reported resources on comparative maps, markers and QTL. To further aid our users, we have also introduced a live monthly educational webinar series and a Gramene YouTube channel carrying video tutorials.
Nucleic Acids Research | 2018
Paul J. Kersey; James E. Allen; Alexis Allot; Matthieu Barba; Sanjay Boddu; Bruce J. Bolt; Denise R. Carvalho-Silva; Mikkel Christensen; Paul Davis; Christoph Grabmueller; Navin Kumar; Zicheng Liu; Thomas Maurel; Ben Moore; Mark D. McDowall; Uma Maheswari; Guy Naamati; Victoria Newman; Chuang Kee Ong; Michael Paulini; Helder Pedro; Emily Perry; Matthew Russell; Helen Sparrow; Electra Tapanari; Kieron Taylor; Alessandro Vullo; Gareth Williams; Amonida Zadissa; Andrew Olson
Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species, complementing the resources for vertebrate genomics developed in the Ensembl project (http://www.ensembl.org). Together, the two resources provide a consistent set of programmatic and interactive interfaces to a rich range of data including genome sequence, gene models, transcript sequence, genetic variation, and comparative analysis. This paper provides an update to the previous publications about the resource, with a focus on recent developments and expansions. These include the incorporation of almost 20 000 additional genome sequences and over 35 000 tracks of RNA-Seq data, which have been aligned to genomic sequence and made available for visualization. Other advances since 2015 include the release of the database in Resource Description Framework (RDF) format, a large increase in community-derived curation, a new high-performance protein sequence search, additional cross-references, improved annotation of non-protein-coding genes, and the launch of pre-release and archival sites. Collectively, these changes are part of a continuing response to the increasing quantity of publicly-available genome-scale data, and the consequent need to archive, integrate, annotate and disseminate these using automated, scalable methods.
Nature Genetics | 2018
Joshua C. Stein; Yeisoo Yu; Dario Copetti; Derrick J. Zwickl; Li Zhang; Chengjun Zhang; Kapeel Chougule; Dongying Gao; Aiko Iwata; Jose Luis Goicoechea; Sharon Wei; Jun Wang; Yi Liao; Muhua Wang; Julie Jacquemin; Claude Becker; Dave Kudrna; Jianwei Zhang; Carlos E.M. Londono; Xiang Song; Seunghee Lee; Paul Sanchez; Andrea Zuccolo; Jetty S. S. Ammiraju; Jayson Talag; Ann Danowitz; Luis F. Rivera; Andrea R. Gschwend; Christos Noutsos; Cheng Chieh Wu
The genus Oryza is a model system for the study of molecular evolution over time scales ranging from a few thousand to 15 million years. Using 13 reference genomes spanning the Oryza species tree, we show that, despite few large-scale chromosomal rearrangements, rapid species diversification is mirrored by lineage-specific emergence and turnover of many novel elements, including transposons, and potential new coding and noncoding genes. Our study resolves controversial areas of the Oryza phylogeny, showing a complex history of introgression among different chromosomes in the young ‘AA’ subclade containing the two domesticated species. This study highlights the prevalence of functionally coupled disease resistance genes and identifies many new haplotypes of potential use for future crop protection. Finally, this study marks a milestone in modern rice research with the release of a complete long-read assembly of IR 8 ‘Miracle Rice’, which relieved famine and drove the Green Revolution in Asia 50 years ago. Genome assemblies of 13 domesticated and wild rice relatives reveal salient features of genome evolution across the genus Oryza, especially rapid species diversification and turnover of transposons. This study also releases a complete long-read assembly of IR 8 ‘Miracle Rice’.
Methods in Molecular Biology | 2016
Marcela K. Tello-Ruiz; Joshua C. Stein; Sharon Wei; Ken Youens-Clark; Pankaj Jaiswal; Doreen Ware
Gramene is an integrated informatics resource for accessing, visualizing, and comparing plant genomes and biological pathways. Originally targeting grasses, Gramene has grown to host annotations for economically important and research model crops, including wheat, potato, tomato, banana, grape, poplar, and Chlamydomonas. Its strength derives from the application of a phylogenetic framework for genome comparison and the use of ontologies to integrate structural and functional annotation data. This chapter outlines system requirements for end users and database hosting, data types and basic navigation within Gramene, and provides examples of how to (1) view a phylogenetic tree for a family of transcription factors, (2) explore genetic variation in the orthologues of a gene with a known trait association, and (3) upload, visualize, and privately share end user data into a new genome browser track. Moreover, this is the first publication describing Gramene's new web interface, intended to provide a simplified portal to the most complete and up-to-date set of plant genome and pathway annotations.
Current Plant Biology | 2016
Parul Gupta; Sushma Naithani; Marcela K. Tello-Ruiz; Kapeel Chougule; Peter D’Eustachio; Antonio Fabregat; Yinping Jiao; Maria Keays; Young Koung Lee; Sunita Kumari; Joseph Mulvaney; Andrew Olson; Justin Preece; Joshua C. Stein; Sharon Wei; Joel Weiser; Laura Huerta; Robert Petryszak; Paul J. Kersey; Lincoln Stein; Doreen Ware; Pankaj Jaiswal
Gramene (http://www.gramene.org) is an online, open-source, curated resource for plant comparative genomics and pathway analysis designed to support researchers working in plant genomics, breeding, evolutionary biology, systems biology, and metabolic engineering. It exploits phylogenetic relationships to enrich the annotation of genomic data and provides tools to perform powerful comparative analyses across a wide spectrum of plant species. It consists of an integrated portal for querying, visualizing and analyzing data for 44 plant reference genomes, genetic variation data sets for 12 species, expression data for 16 species, curated rice pathways and orthology-based pathway projections for 66 plant species including various crops. Here we briefly describe the functions and uses of the Gramene database.
Nucleic Acids Research | 2018
Marcela K. Tello-Ruiz; Sushma Naithani; Joshua C. Stein; Parul Gupta; Michael S. Campbell; Andrew Olson; Sharon Wei; Justin Preece; Matthew Geniza; Yinping Jiao; Young Koung Lee; Bo Wang; Joseph Mulvaney; Kapeel Chougule; Justin Elser; Noor Al-Bader; Sunita Kumari; James Thomason; Vivek Kumar; Daniel M. Bolser; Guy Naamati; Electra Tapanari; Nuno A. Fonseca; Laura Huerta; Haider Iqbal; Maria Keays; Alfonso Munoz-Pomer Fuentes; Amy Tang; Antonio Fabregat; Peter D’Eustachio
Gramene (http://www.gramene.org) is a knowledgebase for comparative functional analysis in major crops and model plant species. The current release, #54, includes over 1.7 million genes from 44 reference genomes, most of which were organized into 62,367 gene families through orthologous and paralogous gene classification, whole-genome alignments, and synteny. Additional gene annotations include ontology-based protein structure and function; genetic, epigenetic, and phenotypic diversity; and pathway associations. Gramene's Plant Reactome provides a knowledgebase of cellular-level plant pathway networks. Specifically, it uses curated rice reference pathways to derive pathway projections for an additional 66 species based on gene orthology, and facilitates display of gene expression, gene–gene interactions, and user-defined omics data in the context of these pathways. As a community portal, Gramene integrates best-of-class software and infrastructure components including the Ensembl genome browser, Reactome pathway browser, and Expression Atlas widgets, and undergoes periodic data and software upgrades. Via powerful, intuitive search interfaces, users can easily query across various portals and interactively analyze search results by clicking on diverse features such as genomic context, highly augmented gene trees, gene expression anatomograms, associated pathways, and external informatics resources. All data in Gramene are accessible through both visual and programmatic interfaces.
Nature Genetics | 2018
Joshua C. Stein; Yeisoo Yu; Dario Copetti; Derrick J. Zwickl; Li Zhang; Chengjun Zhang; Kapeel Chougule; Dongying Gao; Aiko Iwata; Jose Luis Goicoechea; Sharon Wei; Jun Wang; Yi Liao; Muhua Wang; Julie Jacquemin; Claude Becker; Dave Kudrna; Jianwei Zhang; Carlos E.M. Londono; Xiang Song; Seunghee Lee; Paul Sanchez; Andrea Zuccolo; Jetty S. S. Ammiraju; Jayson Talag; Ann Danowitz; Luis F. Rivera; Andrea R. Gschwend; Christos Noutsos; Cheng-chieh Wu
This article was not made open access when initially published online; this was corrected before print publication. In addition, ORCID links were missing for 12 authors and have been added to the HTML and PDF versions of the article.