Publication


Featured research published by Valerie Schneider.


PLOS Biology | 2011

Modernizing Reference Genome Assemblies

Deanna M. Church; Valerie Schneider; Tina Graves; Katherine Auger; Fiona Cunningham; Nathan Bouk; Hsiu Chuan Chen; Richa Agarwala; William M. McLaren; Graham R. S. Ritchie; Derek Albracht; Milinn Kremitzki; Susan Rock; Holland Kotkiewicz; Colin Kremitzki; Aye Wollam; Lee Trani; Lucinda Fulton; Robert S. Fulton; Lucy Matthews; S. Whitehead; William Chow; James Torrance; Matthew Dunn; Glenn Harden; Glen Threadgold; Jonathan Wood; Joanna Collins; Paul Heath; Guy Griffiths

I have read the journal's policy and have the following conflicts: Paul Flicek is married to the deputy editor of PLoS Medicine, Melissa Norton. Evan Eichler is on the board of Pacific Biosciences. Support for this work came from the Intramural Research Program of the NIH, The National Library of Medicine, the European Molecular Biology Laboratory, the Wellcome Trust (grant number 077198), and the Howard Hughes Medical Institute (EEE). The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.


Genome Biology | 2015

Extending reference assembly models.

Deanna M. Church; Valerie Schneider; Karyn Meltz Steinberg; Michael C. Schatz; Aaron R. Quinlan; Chen Shan Chin; Paul Kitts; Bronwen Aken; Gabor T. Marth; Michael M. Hoffman; Javier Herrero; M. Lisandra Zepeda Mendoza; Richard Durbin; Paul Flicek

The human genome reference assembly is crucial for aligning and analyzing sequence data, and for genome annotation, among other roles. However, the models and analysis assumptions that underlie the current assembly need revising to fully represent human sequence diversity. Improved analysis tools and updated data reporting formats are also required.


Nature Communications | 2016

Long-read sequencing and de novo assembly of a Chinese genome

Lingling Shi; Yunfei Guo; Chengliang Dong; John Huddleston; Hui Yang; Xiaolu Han; Aisi Fu; Quan Li; Na Li; Siyi Gong; Katherine E Lintner; Qiong Ding; Zou Wang; Jiang Hu; Depeng Wang; Feng Wang; Lin Wang; Gholson J. Lyon; Yongtao Guan; Yufeng Shen; Oleg V. Evgrafov; James A. Knowles; Françoise Thibaud-Nissen; Valerie Schneider; Chack Yung Yu; Libing Zhou; Evan E. Eichler; Kf So; Kai Wang

Short-read sequencing has enabled the de novo assembly of several individual human genomes, but with inherent limitations in characterizing repeat elements. Here we sequence a Chinese individual HX1 by single-molecule real-time (SMRT) long-read sequencing, construct a physical map by NanoChannel arrays and generate a de novo assembly of 2.93 Gb (contig N50: 8.3 Mb, scaffold N50: 22.0 Mb, including 39.3 Mb N-bases), together with 206 Mb of alternative haplotypes. The assembly fully or partially fills 274 (28.4%) N-gaps in the reference genome GRCh38. Comparison to GRCh38 reveals 12.8 Mb of HX1-specific sequences, including 4.1 Mb that are not present in previously reported Asian genomes. Furthermore, long-read sequencing of the transcriptome reveals novel spliced genes that are not annotated in GENCODE and are missed by short-read RNA-Seq. Our results imply that improved characterization of genome functional variation may require the use of a range of genomic technologies on diverse human populations.


Genome Research | 2014

Single haplotype assembly of the human genome from a hydatidiform mole

Karyn Meltz Steinberg; Valerie Schneider; Tina A. Graves-Lindsay; Robert S. Fulton; Richa Agarwala; John Huddleston; Sergey A. Shiryev; Aleksandr Morgulis; Urvashi Surti; Wesley C. Warren; Deanna M. Church; Evan E. Eichler; Richard K. Wilson

A complete reference assembly is essential for accurately interpreting individual genomes and associating variation with phenotypes. While the current human reference genome sequence is of very high quality, gaps and misassemblies remain due to biological and technical complexities. Large repetitive sequences and complex allelic diversity are the two main drivers of assembly error. Although increasing the length of sequence reads and library fragments can improve assembly, even the longest available reads do not resolve all regions. In order to overcome the issue of allelic diversity, we used genomic DNA from an essentially haploid hydatidiform mole, CHM1. We utilized several resources from this DNA including a set of end-sequenced and indexed BAC clones and 100× Illumina whole-genome shotgun (WGS) sequence coverage. We used the WGS sequence and the GRCh37 reference assembly to create an assembly of the CHM1 genome. We subsequently incorporated 382 finished BAC clone sequences to generate a draft assembly, CHM1_1.1 (NCBI AssemblyDB GCA_000306695.2). Analysis of gene, repetitive element, and segmental duplication content shows this assembly to be of excellent quality and contiguity. However, comparison to assembly-independent resources, such as BAC clone end sequences and PacBio long reads, indicates misassembled regions. Most of these regions are enriched for structural variation and segmental duplication, and can be resolved in the future. This publicly available assembly will be integrated into the Genome Reference Consortium curation framework for further improvement, with the ultimate goal being a completely finished gap-free assembly.


Genome Research | 2017

Evaluation of GRCh38 and de novo haploid genome assemblies demonstrates the enduring quality of the reference assembly.

Valerie Schneider; Tina A. Graves-Lindsay; Kerstin Howe; Nathan Bouk; Hsiu-Chuan Chen; Paul Kitts; Terence Murphy; Kim D. Pruitt; Françoise Thibaud-Nissen; Derek Albracht; Robert S. Fulton; Milinn Kremitzki; Vincent Magrini; Chris Markovic; Sean McGrath; Karyn Meltz Steinberg; Kate Auger; William Chow; Joanna Collins; Glenn Harden; Tim Hubbard; Sarah Pelan; Jared T. Simpson; Glen Threadgold; James Torrance; Jonathan Wood; Laura Clarke; Sergey Koren; Matthew Boitano; Paul Peluso

The human reference genome assembly plays a central role in nearly all aspects of today's basic and clinical research. GRCh38 is the first coordinate-changing assembly update since 2009; it reflects the resolution of roughly 1000 issues and encompasses modifications ranging from thousands of single base changes to megabase-scale path reorganizations, gap closures, and localization of previously orphaned sequences. We developed a new approach to sequence generation for targeted base updates and used data from new genome mapping technologies and single haplotype resources to identify and resolve larger assembly issues. For the first time, the reference assembly contains sequence-based representations for the centromeres. We also expanded the number of alternate loci to create a reference that provides a more robust representation of human population variation. We demonstrate that the updates render the reference an improved annotation substrate, alter read alignments in unchanged regions, and impact variant interpretation at clinically relevant loci. We additionally evaluated a collection of new de novo long-read haploid assemblies and conclude that although the new assemblies compare favorably to the reference with respect to continuity, error rate, and gene completeness, the reference still provides the best representation for complex genomic regions and coding sequences. We assert that the collected updates in GRCh38 make the newer assembly a more robust substrate for comprehensive analyses that will promote our understanding of human biology and advance our efforts to improve health.


G3: Genes, Genomes, Genetics | 2017

A New Chicken Genome Assembly Provides Insight into Avian Genome Structure

Wesley C. Warren; LaDeana W. Hillier; Chad Tomlinson; Patrick Minx; Milinn Kremitzki; Tina Graves; Chris Markovic; Nathan Bouk; Kim D. Pruitt; Françoise Thibaud-Nissen; Valerie Schneider; Tamer Mansour; C. Titus Brown; Aleksey V. Zimin; R. J. Hawken; Mitch Abrahamsen; Alexis B. Pyrkosz; Mireille Morisson; Valerie Fillon; Alain Vignal; William Chow; Kerstin Howe; Janet E. Fulton; Marcia M. Miller; Peter V. Lovell; Claudio V. Mello; Morgan Wirthlin; Andrew S. Mason; Richard Kuo; David W. Burt

The importance of the Gallus gallus (chicken) as a model organism and agricultural animal merits a continuation of sequence assembly improvement efforts. We present a new version of the chicken genome assembly (Gallus_gallus-5.0; GCA_000002315.3), built from combined long single molecule sequencing technology, finished BACs, and improved physical maps. In overall assembled bases, we see a gain of 183 Mb, including 16.4 Mb in placed chromosomes with a corresponding gain in the percentage of intact repeat elements characterized. Of the 1.21 Gb genome, we include three previously missing autosomes, GGA30, 31, and 33, and improve sequence contig length 10-fold over the previous Gallus_gallus-4.0. Despite the significant base representation improvements made, 138 Mb of sequence is not yet located to chromosomes. When annotated for gene content, Gallus_gallus-5.0 shows an increase of 4679 annotated genes (2768 noncoding and 1911 protein-coding) over those in Gallus_gallus-4.0. We also revisited the question of what genes are missing in the avian lineage, as assessed by the highest quality avian genome assembly to date, and found that a large fraction of the original set of missing genes are still absent in sequenced bird species. Finally, our new data support a detailed map of MHC-B, encompassing two segments: one with a highly stable gene copy number and another in which the gene copy number is highly variable. The chicken model has been a critical resource for many other fields of study, and this new reference assembly will substantially further these efforts.


Nucleic Acids Research | 2017

The dbGaP data browser: a new tool for browsing dbGaP controlled-access genomic data

Kira M. Wong; Kristofor Langlais; Geoffrey S. Tobias; Colette Fletcher-Hoppe; Donna Krasnewich; Hilary S. Leeds; Laura Lyman Rodriguez; Georgy Godynskiy; Valerie Schneider; Erin M. Ramos; Stephen T. Sherry

The database of Genotypes and Phenotypes (dbGaP) Data Browser (https://www.ncbi.nlm.nih.gov/gap/ddb/) was developed in response to requests from the scientific community for a resource that enables view-only access to summary-level information and individual-level genotype and sequence data associated with phenotypic features maintained in the controlled-access tier of dbGaP. Until now, the dbGaP controlled-access environment required investigators to submit a data access request, wait for Data Access Committee review, download each data set and locally examine them for potentially relevant information. Existing unrestricted-access genomic data browsing resources (e.g. http://evs.gs.washington.edu/EVS/, http://exac.broadinstitute.org/) provide only summary statistics or aggregate allele frequencies. The dbGaP Data Browser serves as a third solution, providing researchers with view-only access to a compilation of individual-level data from general research use (GRU) studies through a simplified controlled-access process. The National Institutes of Health (NIH) will continue to improve the Browser in response to user feedback and believes that this tool may decrease unnecessary download requests, while still facilitating responsible genomic data-sharing.


Science | 2018

High-resolution comparative analysis of great ape genomes

Zev N. Kronenberg; Ian T Fiddes; David Gordon; Shwetha Murali; Stuart Cantsilieris; Olivia S. Meyerson; Jason G. Underwood; Bradley J. Nelson; Mark Chaisson; Max Dougherty; Katherine M. Munson; Alex Hastie; Mark Diekhans; Fereydoun Hormozdiari; Nicola Lorusso; Kendra Hoekzema; Ruolan Qiu; Karen Clark; Archana Raja; AnneMarie E. Welch; Melanie Sorensen; Carl Baker; Robert S. Fulton; Joel Armstrong; Tina A. Graves-Lindsay; Ahmet M. Denli; Emma R. Hoppe; Pinghsun Hsieh; Christopher M. Hill; Andy Wing Chun Pang

A spotlight on great ape genomes

Most nonhuman primate genomes generated to date have been “humanized” owing to their many gaps and the reliance on guidance by the reference human genome. To remove this humanizing effect, Kronenberg et al. generated and assembled long-read genomes of a chimpanzee, an orangutan, and two humans and compared them with a previously generated gorilla genome. This analysis recognized genomic structural variation specific to humans and particular ape lineages. Comparisons between human and chimpanzee cerebral organoids showed down-regulation of the expression of specific genes in humans, relative to chimpanzees, related to noncoding variation identified in this analysis. Science, this issue p. eaar6343

Analysis of long-read great ape and human genomes identifies human-specific changes affecting brain gene expression.

INTRODUCTION Understanding the genetic differences that make us human is a long-standing endeavor that requires the comprehensive discovery and comparison of all forms of genetic variation within great ape lineages.

RATIONALE The varied quality and completeness of ape genomes have limited comparative genetic analyses. To eliminate this contiguity and quality disparity, we generated human and nonhuman ape genome assemblies without the guidance of the human reference genome. These new genome assemblies enable both coarse and fine-scale comparative genomic studies.

RESULTS We sequenced and assembled two human, one chimpanzee, and one orangutan genome using high-coverage (>65x) single-molecule, real-time (SMRT) long-read sequencing technology. We also sequenced more than 500,000 full-length complementary DNA samples from induced pluripotent stem cells to construct de novo gene models, increasing our knowledge of transcript diversity in each ape lineage. The new nonhuman ape genome assemblies improve gene annotation and genomic contiguity (by 30- to 500-fold), resulting in the identification of larger synteny blocks (by 22- to 74-fold) when compared to earlier assemblies. Including the latest gorilla genome, we now estimate that 83% of the ape genomes can be compared in a multiple sequence alignment. We observe a modest increase in single-nucleotide variant divergence compared to previous genome analyses and estimate that 36% of human autosomal DNA is subject to incomplete lineage sorting. We fully resolve most common repeat differences, including full-length retrotransposons such as the African ape-specific endogenous retroviral element PtERV1. We show that the spread of this element independently in the gorilla and chimpanzee lineage likely resulted from a founder element that failed to segregate to the human lineage because of incomplete lineage sorting. The improved sequence contiguity allowed a more systematic discovery of structural variation (>50 base pairs in length) (see the figure). We detected 614,186 ape deletions, insertions, and inversions, assigning each to specific ape lineages. Unbiased genome scaffolding (optical maps, bacterial artificial chromosome sequencing, and fluorescence in situ hybridization) led to the discovery of large, unknown complex inversions in gene-rich regions. Of the 17,789 fixed human-specific insertions and deletions, we focus on those of potential functional effect. We identify 90 that are predicted to disrupt genes and an additional 643 that likely affect regulatory regions, more than doubling the number of human-specific deletions that remove regulatory sequence in the human lineage. We investigate the association of structural variation with changes in human-chimpanzee brain gene expression using cerebral organoids as a proxy for expression differences. Genes associated with fixed structural variants (SVs) show a pattern of down-regulation in human radial glial neural progenitors, whereas human-specific duplications are associated with up-regulated genes in human radial glial and excitatory neurons (see the figure).

CONCLUSION The improved ape genome assemblies provide the most comprehensive view to date of intermediate-size structural variation and highlight several dozen genes associated with structural variation and brain-expression differences between humans and chimpanzees. These new references will provide a stepping stone for the completion of great ape genomes at a quality commensurate with the human reference genome and, ultimately, an understanding of the genetic differences that make us human.

SMRT assemblies and SV analyses. (Top) Contiguity of the de novo assemblies. (Bottom, left to right) For each ape, SV detection was done against the human reference genome (as represented by a dot plot of an inversion). Human-specific SVs, identified by comparing ape SVs and population genotyping (0/0, homozygous reference), were compared to single-cell gene expression differences [range: low (dark blue) to high (dark red)] in primary and organoid tissues. Each heatmap row is a gene that intersects an insertion or deletion (green), duplication (cyan), or inversion (light green).

Genetic studies of human evolution require high-quality contiguous ape genome assemblies that are not guided by the human reference. We coupled long-read sequence assembly and full-length complementary DNA sequencing with a multiplatform scaffolding approach to produce ab initio chimpanzee and orangutan genome assemblies. By comparing these with two long-read de novo human genome assemblies and a gorilla genome assembly, we characterized lineage-specific and shared great ape genetic variation ranging from single– to mega–base pair–sized variants. We identified ~17,000 fixed human-specific structural variants identifying genic and putative regulatory changes that have emerged in humans since divergence from nonhuman apes. Interestingly, these variants are enriched near genes that are down-regulated in human compared to chimpanzee cerebral organoids, particularly in cells analogous to radial glial neural progenitors.


Nucleic Acids Research | 2012

Clone DB: an integrated NCBI resource for clone-associated data

Valerie Schneider; Hsiu-Chuan Chen; Cliff Clausen; Peter Meric; Zhigang Zhou; Nathan Bouk; Nora Husain; Donna Maglott; Deanna M. Church

The National Center for Biotechnology Information (NCBI) Clone DB (http://www.ncbi.nlm.nih.gov/clone/) is an integrated resource providing information about and facilitating access to clones, which serve as valuable research reagents in many fields, including genome sequencing and variation analysis. Clone DB represents an expansion and replacement of the former NCBI Clone Registry and has records for genomic and cell-based libraries and clones representing more than 100 different eukaryotic taxa. Records provide details of library construction, associated sequences, map positions and information about resource distribution. Clone DB is indexed in the NCBI Entrez system and can be queried by fields that include organism, clone name, gene name and sequence identifier. Whenever possible, genomic clones are mapped to reference assemblies and their map positions provided in clone records. Clones mapping to specific genomic regions can also be searched for using the NCBI Clone Finder tool, which accepts queries based on sequence coordinates or features such as gene or transcript names. Clone DB makes reports of library, clone and placement data on its FTP site available for download. With Clone DB, users now have available to them a centralized resource that provides them with the tools they will need to make use of these important research reagents.


bioRxiv | 2016

High-Quality Assembly of an Individual of Yoruban Descent

Karyn Meltz Steinberg; Tina Graves-Lindsay; Valerie Schneider; Mark Chaisson; Chad Tomlinson; John Huddleston; Patrick Minx; Milinn Kremitzki; Derek Albrecht; Vincent Magrini; Sean McGrath; Archana Raja; Carl Baker; Lana Harshman; LaDeana W. Hillier; Françoise Thibaud-Nissen; Nathan Bouk; Amy Ly; Chris T. Amemiya; Joyce Tang; Evan E. Eichler; Robert S. Fulton; Wesley C. Warren; Deanna M. Church; Richard Wilson

De novo assembly of human genomes is now a tractable effort due in part to advances in sequencing and mapping technologies. We use PacBio single-molecule, real-time (SMRT) sequencing and BioNano genomic maps to construct the first de novo assembly of NA19240, a Yoruban individual from Africa. This chromosome-scaffolded assembly of 3.08 Gb with a contig N50 of 7.25 Mb and a scaffold N50 of 78.6 Mb represents one of the most contiguous high-quality human genomes. We utilize a BAC library derived from NA19240 DNA and novel haplotype-resolving sequencing technologies and algorithms to characterize regions of complex genomic architecture that are normally lost due to compression to a linear haploid assembly. Our results demonstrate that multiple technologies are still necessary for complete genomic representation, particularly in regions of highly identical segmental duplications. Additionally, we show that diploid assembly has utility in improving the quality of de novo human genome assemblies.

Collaboration


Dive into Valerie Schneider's collaboration.

Top Co-Authors

Deanna M. Church

National Institutes of Health

Karyn Meltz Steinberg

Washington University in St. Louis

Nathan Bouk

National Institutes of Health

Robert S. Fulton

Washington University in St. Louis

Tina A. Graves-Lindsay

Washington University in St. Louis
