Brian Walenz | Researchain

Publication

Featured researches published by Brian Walenz.

PLOS Biology | 2007

The Diploid Genome Sequence of an Individual Human

Samuel Levy; Granger Sutton; Pauline C. Ng; Lars Feuk; Aaron L. Halpern; Brian Walenz; Nelson Axelrod; Jiaqi Huang; Ewen F. Kirkness; Gennady Denisov; Yuan Lin; Jeffrey R. MacDonald; Andy Wing Chun Pang; Mary Shago; Timothy B. Stockwell; Alexia Tsiamouri; Vineet Bafna; Vikas Bansal; Saul Kravitz; Dana Busam; Karen Beeson; Tina McIntosh; Karin A. Remington; Josep F. Abril; John Gill; Jon Borman; Yu-Hui Rogers; Marvin Frazier; Stephen W. Scherer; Robert L. Strausberg

Presented here is a genome sequence of an individual human. It was produced from ∼32 million random DNA fragments, sequenced by Sanger dideoxy technology and assembled into 4,528 scaffolds, comprising 2,810 million bases (Mb) of contiguous sequence with approximately 7.5-fold coverage for any given region. We developed a modified version of the Celera assembler to facilitate the identification and comparison of alternate alleles within this individual diploid genome. Comparison of this genome and the National Center for Biotechnology Information human reference assembly revealed more than 4.1 million DNA variants, encompassing 12.3 Mb. These variants (of which 1,288,319 were novel) included 3,213,401 single nucleotide polymorphisms (SNPs), 53,823 block substitutions (2–206 bp), 292,102 heterozygous insertion/deletion events (indels) (1–571 bp), 559,473 homozygous indels (1–82,711 bp), 90 inversions, as well as numerous segmental duplications and copy number variation regions. Non-SNP DNA variation accounts for 22% of all events identified in the donor; however, these events involve 74% of all variant bases. This suggests an important role for non-SNP genetic alterations in defining the diploid genome structure. Moreover, 44% of genes were heterozygous for one or more variants. Using a novel haplotype assembly strategy, we were able to span 1.5 Gb of genome sequence in segments >200 kb, providing further precision to the diploid nature of the genome. These data depict a definitive molecular portrait of a diploid human genome that provides a starting point for future genome comparisons and enables an era of individualized genomic information.
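
The 22% figure in the abstract above can be checked against the itemized counts it reports. The sketch below is only an arithmetic illustration, not part of the paper's analysis: the dictionary keys and the function name are made up for this example, and segmental duplications and copy number variation regions are left out because the abstract gives no counts for them.

```python
# Variant counts copied from the abstract above. Segmental duplications and
# copy number variation regions are omitted (no counts are given for them).
variant_counts = {
    "SNP": 3_213_401,
    "block substitution": 53_823,
    "heterozygous indel": 292_102,
    "homozygous indel": 559_473,
    "inversion": 90,
}


def non_snp_fraction(counts):
    """Return the share of variant events that are not SNPs."""
    total = sum(counts.values())
    non_snp = total - counts["SNP"]
    return non_snp / total


total_events = sum(variant_counts.values())
fraction = non_snp_fraction(variant_counts)
print(f"{total_events:,} events, {fraction:.1%} non-SNP")
```

Under these assumptions the itemized events sum to a little over 4.1 million, and the non-SNP share rounds to the 22% quoted in the abstract.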

Nature Biotechnology | 2012

Hybrid error correction and de novo assembly of single-molecule sequencing reads

Sergey Koren; Michael C. Schatz; Brian Walenz; Jeffrey Martin; Jason T. Howard; Ganeshkumar Ganapathy; Zhong Wang; David A. Rasko; W. Richard McCombie; Erich D. Jarvis; Adam M. Phillippy

Single-molecule sequencing instruments can generate multikilobase sequences with the potential to greatly improve genome and transcriptome assembly. However, the error rates of single-molecule reads are high, which has limited their use thus far to resequencing bacteria. To address this limitation, we introduce a correction algorithm and assembly strategy that uses short, high-fidelity sequences to correct the error in single-molecule sequences. We demonstrate the utility of this approach on reads generated by a PacBio RS instrument from phage, prokaryotic and eukaryotic whole genomes, including the previously unsequenced genome of the parrot Melopsittacus undulatus, as well as for RNA-Seq reads of the corn (Zea mays) transcriptome. Our long-read correction achieves >99.9% base-call accuracy, leading to substantially better assemblies than current sequencing strategies: in the best example, the median contig size was quintupled relative to high-coverage, second-generation assemblies. Greater gains are predicted if read lengths continue to increase, including the prospect of single-contig bacterial chromosome assembly.

Nature | 2010

The dynamic genome of Hydra

Jarrod Chapman; Ewen F. Kirkness; Oleg Simakov; Steven E. Hampson; Therese Mitros; Therese Weinmaier; Thomas Rattei; Prakash G. Balasubramanian; Jon Borman; Dana Busam; Kathryn Disbennett; Cynthia Pfannkoch; Nadezhda Sumin; Granger Sutton; Lakshmi Viswanathan; Brian Walenz; David Goodstein; Uffe Hellsten; Takeshi Kawashima; Simon Prochnik; Nicholas H. Putnam; Shengquiang Shu; Bruce Blumberg; Catherine E. Dana; Lydia Gee; Dennis F. Kibler; Lee Law; Dirk Lindgens; Daniel E. Martínez; Jisong Peng

The freshwater cnidarian Hydra was first described in 1702 and has been the object of study for 300 years. Experimental studies of Hydra between 1736 and 1744 culminated in the discovery of asexual reproduction of an animal by budding, the first description of regeneration in an animal, and successful transplantation of tissue between animals. Today, Hydra is an important model for studies of axial patterning, stem cell biology and regeneration. Here we report the genome of Hydra magnipapillata and compare it to the genomes of the anthozoan Nematostella vectensis and other animals. The Hydra genome has been shaped by bursts of transposable element expansion, horizontal gene transfer, trans-splicing, and simplification of gene structure and gene content that parallel simplification of the Hydra life cycle. We also report the sequence of the genome of a novel bacterium stably associated with H. magnipapillata. Comparisons of the Hydra genome to the genomes of other animals shed light on the evolution of epithelia, contractile tissues, developmentally regulated transcription factors, the Spemann–Mangold organizer, pluripotency genes and the neuromuscular junction.

Genome Research | 2017

Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation

Sergey Koren; Brian Walenz; Konstantin Berlin; Jason R. Miller; Nicholas H. Bergman; Adam M. Phillippy

Long-read single-molecule sequencing has revolutionized de novo genome assembly and enabled the automated reconstruction of reference-quality genomes. However, given the relatively high error rates of such technologies, efficient and accurate assembly of large repeats and closely related haplotypes remains challenging. We address these issues with Canu, a successor of Celera Assembler that is specifically designed for noisy single-molecule sequences. Canu introduces support for nanopore sequencing, halves depth-of-coverage requirements, and improves assembly continuity while simultaneously reducing runtime by an order of magnitude on large genomes versus Celera Assembler 8.2. These advances result from new overlapping and assembly algorithms, including an adaptive overlapping strategy based on tf-idf weighted MinHash and a sparse assembly graph construction that avoids collapsing diverged repeats and haplotypes. We demonstrate that Canu can reliably assemble complete microbial genomes and near-complete eukaryotic chromosomes using either Pacific Biosciences (PacBio) or Oxford Nanopore technologies and achieves a contig NG50 of >21 Mbp on both human and Drosophila melanogaster PacBio data sets. For assembly structures that cannot be linearly represented, Canu provides graph-based assembly outputs in graphical fragment assembly (GFA) format for analysis or integration with complementary phasing and scaffolding techniques. The combination of such highly resolved assembly graphs with long-range scaffolding information promises the complete and automated assembly of complex genomes.
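
The abstract above reports continuity as contig NG50. As a minimal sketch of how that statistic is commonly defined (the length of the contig at which the cumulative length of the longest contigs first reaches half of the genome size), and not a reproduction of Canu's own code, it could be computed as follows. The function name and the toy lengths are hypothetical.

```python
def ng50(contig_lengths, genome_size):
    """Return the NG50 of an assembly, or None if it is undefined.

    Contigs are visited from longest to shortest; NG50 is the length of the
    contig at which the running total first reaches half of genome_size.
    """
    half = genome_size / 2
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running >= half:
            return length
    # The contigs together cover less than half of the genome.
    return None


# Toy example: four contigs and an assumed genome size of 120 bases.
example = ng50([40, 30, 20, 10], genome_size=120)
print(example)
```

Unlike N50, which uses the total assembly length as the denominator, NG50 uses the genome size, so assemblies of different total length can be compared on the same scale.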

Proceedings of the National Academy of Sciences of the United States of America | 2010

Genome sequences of the human body louse and its primary endosymbiont provide insights into the permanent parasitic lifestyle

Ewen F. Kirkness; Brian J. Haas; Weilin Sun; Henk R. Braig; M. Alejandra Perotti; John M. Clark; Si Hyeock Lee; Hugh M. Robertson; Ryan C. Kennedy; Eran Elhaik; Daniel Gerlach; Evgenia V. Kriventseva; Christine G. Elsik; Dan Graur; Catherine A. Hill; Jan A. Veenstra; Brian Walenz; Jose M. C. Tubio; José M. C. Ribeiro; Julio Rozas; J. Spencer Johnston; Justin T. Reese; Aleksandar Popadić; Marta Tojo; Didier Raoult; David L. Reed; Yoshinori Tomoyasu; Emily Kraus; Omprakash Mittapalli; Venu M. Margam

As an obligatory parasite of humans, the body louse (Pediculus humanus humanus) is an important vector for human diseases, including epidemic typhus, relapsing fever, and trench fever. Here, we present genome sequences of the body louse and its primary bacterial endosymbiont Candidatus Riesia pediculicola. The body louse has the smallest known insect genome, spanning 108 Mb. Despite its status as an obligate parasite, it retains a remarkably complete basal insect repertoire of 10,773 protein-coding genes and 57 microRNAs. Representing hemimetabolous insects, the genome of the body louse thus provides a reference for studies of holometabolous insects. Compared with other insect genomes, the body louse genome contains significantly fewer genes associated with environmental sensing and response, including odorant and gustatory receptors and detoxifying enzymes. The unique architecture of the 18 minicircular mitochondrial chromosomes of the body louse may be linked to the loss of the gene encoding the mitochondrial single-stranded DNA binding protein. The genome of the obligatory louse endosymbiont Candidatus Riesia pediculicola encodes less than 600 genes on a short, linear chromosome and a circular plasmid. The plasmid harbors a unique arrangement of genes required for the synthesis of pantothenate, an essential vitamin deficient in the louse diet. The human body louse, its primary endosymbiont, and the bacterial pathogens that it vectors all possess genomes reduced in size compared with their free-living close relatives. Thus, the body louse genome project offers unique information and tools to use in advancing understanding of coevolution among vectors, symbionts, and pathogens.

Nature | 2012

The bonobo genome compared with the chimpanzee and human genomes

Kay Prüfer; Kasper Munch; Ines Hellmann; Keiko Akagi; Jason R. Miller; Brian Walenz; Sergey Koren; Granger Sutton; Chinnappa D. Kodira; Roger Winer; James Knight; James C. Mullikin; Stephen Meader; Chris P. Ponting; Gerton Lunter; Saneyuki Higashino; Asger Hobolth; Julien Y. Dutheil; Emre Karakoc; Can Alkan; Saba Sajjadian; Claudia Rita Catacchio; Mario Ventura; Tomas Marques-Bonet; Evan E. Eichler; Claudine André; Rebeca Atencia; Lawrence Mugisha; Jörg Junhold; Nick Patterson

Two African apes are the closest living relatives of humans: the chimpanzee (Pan troglodytes) and the bonobo (Pan paniscus). Although they are similar in many respects, bonobos and chimpanzees differ strikingly in key social and sexual behaviours, and for some of these traits they show more similarity with humans than with each other. Here we report the sequencing and assembly of the bonobo genome to study its evolutionary relationship with the chimpanzee and human genomes. We find that more than three per cent of the human genome is more closely related to either the bonobo or the chimpanzee genome than these are to each other. These regions allow various aspects of the ancestry of the two ape species to be reconstructed. In addition, many of the regions that overlap genes may eventually help us understand the genetic basis of phenotypes that humans share with one of the two apes to the exclusion of the other.

Nature | 2016

The Atlantic salmon genome provides insights into rediploidization

Sigbjørn Lien; Ben F. Koop; Simen Rød Sandve; Jason R. Miller; Matthew Kent; Torfinn Nome; Torgeir R. Hvidsten; Jong Leong; David R. Minkley; Aleksey V. Zimin; Fabian Grammes; Harald Grove; Arne B. Gjuvsland; Brian Walenz; Russell A. Hermansen; Kristian R. von Schalburg; Eric B. Rondeau; Alex Di Genova; Jeevan Karloss Antony Samy; Jon Olav Vik; Magnus Dehli Vigeland; Lis Caler; Unni Grimholt; Sissel Jentoft; Dag Inge Våge; Pieter J. de Jong; Thomas Moen; Matthew Baranski; Yniv Palti; Douglas W. Smith

The whole-genome duplication 80 million years ago of the common ancestor of salmonids (salmonid-specific fourth vertebrate whole-genome duplication, Ss4R) provides unique opportunities to learn about the evolutionary fate of a duplicated vertebrate genome in 70 extant lineages. Here we present a high-quality genome assembly for Atlantic salmon (Salmo salar), and show that large genomic reorganizations, coinciding with bursts of transposon-mediated repeat expansions, were crucial for the post-Ss4R rediploidization process. Comparisons of duplicate gene expression patterns across a wide range of tissues with orthologous genes from a pre-Ss4R outgroup unexpectedly demonstrate far more instances of neofunctionalization than subfunctionalization. Surprisingly, we find that genes that were retained as duplicates after the teleost-specific whole-genome duplication 320 million years ago were not more likely to be retained after the Ss4R, and that the duplicate retention was not influenced to a great extent by the nature of the predicted protein interactions of the gene products. Finally, we demonstrate that the Atlantic salmon assembly can serve as a reference sequence for the study of other salmonids for a range of purposes.

Science | 2010

Widespread divergence between incipient Anopheles gambiae species revealed by whole genome sequences

Mara K. N. Lawniczak; Scott J. Emrich; Alisha K. Holloway; A. P. Regier; Maynard V. Olson; Bradley J. White; Seth Redmond; Lucinda Fulton; Elizabeth L. Appelbaum; Jennifer Godfrey; Candace N. Farmer; Asif T. Chinwalla; Shiaw-Pyng Yang; Patrick Minx; Joanne O. Nelson; Kim Kyung; Brian Walenz; E. Garcia-Hernandez; M. Aguiar; L. D. Viswanathan; Yu Hui Rogers; Robert L. Strausberg; C. A. Saski; Daniel John Lawson; Frank H. Collins; Fotis C. Kafatos; G. K. Christophides; Sandra W. Clifton; Ewen F. Kirkness; Nora J. Besansky

Signals of Mosquito Speciation

Malaria in Africa is transmitted by the mosquito species complex Anopheles gambiae. Neafsey et al. (p. 514) made high-resolution single-nucleotide arrays to map genetic divergence among members of the species. Differentiation between populations was observed and evidence obtained for selective sweeps within populations. Most divergence occurred within inversion regions around the centromere and in genes associated with development, pheromone signaling, and from the X chromosome. The analysis also revealed signals of sympatric speciation occurring within similar chromosomal regions in mosquitoes from different regions in Africa. Lawniczak et al. (p. 512) sequenced the genomes of two molecular forms (known as M and S) of A. gambiae, which have distinctive behavioral phenotypes and appear to be speciating. This effort resolves problems arising from the apparently chimeric nature of the reference genome and confirms the observed genome-wide divergences. This kind of analysis has the potential to contribute to control programs that can adapt to population shifts in mosquito behavior arising from the selective effects of the control measures themselves.

Gene flow among African malaria vectors is more restricted than previously thought.

The Afrotropical mosquito Anopheles gambiae sensu stricto, a major vector of malaria, is currently undergoing speciation into the M and S molecular forms. These forms have diverged in larval ecology and reproductive behavior through unknown genetic mechanisms, despite considerable levels of hybridization. Previous genome-wide scans using gene-based microarrays uncovered divergence between M and S that was largely confined to gene-poor pericentromeric regions, prompting a speciation-with-ongoing-gene-flow model that implicated only about 3% of the genome near centromeres in the speciation process. Here, based on the complete M and S genome sequences, we report widespread and heterogeneous genomic divergence inconsistent with appreciable levels of interform gene flow, suggesting a more advanced speciation process and greater challenges to identify genes critical to initiating that process.

PLOS Genetics | 2007

Nanoliter Reactors Improve Multiple Displacement Amplification of Genomes from Single Cells

Yann Marcy; Thomas Ishoey; Roger S. Lasken; Timothy B. Stockwell; Brian Walenz; Aaron L. Halpern; Karen Beeson; Susanne M. D. Goldberg; Stephen R. Quake

Since only a small fraction of environmental bacteria are amenable to laboratory culture, there is great interest in genomic sequencing directly from single cells. Sufficient DNA for sequencing can be obtained from one cell by the Multiple Displacement Amplification (MDA) method, thereby eliminating the need to develop culture methods. Here we used a microfluidic device to isolate individual Escherichia coli and amplify genomic DNA by MDA in 60-nl reactions. Our results confirm a report that reduced MDA reaction volume lowers nonspecific synthesis that can result from contaminant DNA templates and unfavourable interaction between primers. The quality of the genome amplification was assessed by qPCR and compared favourably to single-cell amplifications performed in standard 50-μl volumes. Amplification bias was greatly reduced in nanoliter volumes, thereby providing a more even representation of all sequences. Single-cell amplicons from both microliter and nanoliter volumes provided high-quality sequence data by high-throughput pyrosequencing, thereby demonstrating a straightforward route to sequencing genomes from single cells.

Proceedings of the National Academy of Sciences of the United States of America | 2004

Whole-genome shotgun assembly and comparison of human genome assemblies

Sorin Istrail; Granger Sutton; Liliana Florea; Aaron L. Halpern; Clark M. Mobarry; Ross A. Lippert; Brian Walenz; Hagit Shatkay; Ian M. Dew; Jason R. Miller; Michael Flanigan; Nathan Edwards; Randall Bolanos; Daniel Fasulo; Bjarni V. Halldórsson; Sridhar Hannenhalli; Russell Turner; Shibu Yooseph; Fu Lu; Deborah Nusskern; Bixiong Shue; Xiangqun Holly Zheng; Fei Zhong; Arthur L. Delcher; Daniel H. Huson; Saul Kravitz; Laurent Mouchard; Knut Reinert; Karin A. Remington; Andrew G. Clark

We report a whole-genome shotgun assembly (called WGSA) of the human genome generated at Celera in 2001. The Celera-generated shotgun data set consisted of 27 million sequencing reads organized in pairs by virtue of end-sequencing 2-kbp, 10-kbp, and 50-kbp inserts from shotgun clone libraries. The quality-trimmed reads covered the genome 5.3 times, and the inserts from which pairs of reads were obtained covered the genome 39 times. With the nearly complete human DNA sequence [National Center for Biotechnology Information (NCBI) Build 34] now available, it is possible to directly assess the quality, accuracy, and completeness of WGSA and of the first reconstructions of the human genome reported in two landmark papers in February 2001 [Venter, J. C., Adams, M. D., Myers, E. W., Li, P. W., Mural, R. J., Sutton, G. G., Smith, H. O., Yandell, M., Evans, C. A., Holt, R. A., et al. (2001) Science 291, 1304–1351; International Human Genome Sequencing Consortium (2001) Nature 409, 860–921]. The analysis of WGSA shows 97% order and orientation agreement with NCBI Build 34, where most of the 3% of sequence out of order is due to scaffold placement problems as opposed to assembly errors within the scaffolds themselves. In addition, WGSA fills some of the remaining gaps in NCBI Build 34. The early genome sequences all covered about the same amount of the genome, but they did so in different ways. The Celera results provide more order and orientation, and the consortium sequence provides better coverage of exact and nearly exact repeats.