Publication


Featured research published by Aaron L. Halpern.


PLOS Biology | 2007

The Sorcerer II Global Ocean Sampling Expedition: Northwest Atlantic through Eastern Tropical Pacific

Douglas B. Rusch; Aaron L. Halpern; Granger Sutton; Karla B. Heidelberg; Shannon J. Williamson; Shibu Yooseph; Dongying Wu; Jonathan A. Eisen; Jeff Hoffman; Karin A. Remington; Karen Beeson; Bao Duc Tran; Hamilton O. Smith; Holly Baden-Tillson; Clare Stewart; Joyce Thorpe; Jason Freeman; Cynthia Andrews-Pfannkoch; Joseph E. Venter; Kelvin Li; Saul Kravitz; John F. Heidelberg; Terry Utterback; Yu-Hui Rogers; Luisa I. Falcón; Valeria Souza; Germán Bonilla-Rosso; Luis E. Eguiarte; David M. Karl; Shubha Sathyendranath

The world's oceans contain a complex mixture of micro-organisms that are, for the most part, uncharacterized both genetically and biochemically. We report here a metagenomic study of the marine planktonic microbiota in which surface (mostly marine) water samples were analyzed as part of the Sorcerer II Global Ocean Sampling expedition. These samples, collected across a several-thousand km transect from the North Atlantic through the Panama Canal and ending in the South Pacific, yielded an extensive dataset consisting of 7.7 million sequencing reads (6.3 billion bp). Though a few major microbial clades dominate the planktonic marine niche, the dataset contains great diversity, with 85% of the assembled sequence and 57% of the unassembled data being unique at a 98% sequence identity cutoff. Using the metadata associated with each sample and sequencing library, we developed new comparative genomic and assembly methods. One comparative genomic method, termed “fragment recruitment,” addressed questions of genome structure, evolution, and taxonomic or phylogenetic diversity, as well as the biochemical diversity of genes and gene families. A second method, termed “extreme assembly,” made possible the assembly and reconstruction of large segments of abundant but clearly nonclonal organisms. Within all abundant populations analyzed, we found extensive intra-ribotype diversity in several forms: (1) extensive sequence variation within orthologous regions throughout a given genome; despite coverage of individual ribotypes approaching 500-fold, most individual sequencing reads are unique; (2) numerous changes in gene content, some with direct adaptive implications; and (3) hypervariable genomic islands that are too variable to assemble. The intra-ribotype diversity is organized into genetically isolated populations that have overlapping but independent distributions, implying distinct environmental preference.
We present novel methods for measuring the genomic similarity between metagenomic samples and show how they may be grouped into several community types. Specific functional adaptations can be identified both within individual ribotypes and across the entire community, including proteorhodopsin spectral tuning and the presence or absence of the phosphate-binding gene PstS.


PLOS Biology | 2007

The Diploid Genome Sequence of an Individual Human

Samuel Levy; Granger Sutton; Pauline C. Ng; Lars Feuk; Aaron L. Halpern; Brian Walenz; Nelson Axelrod; Jiaqi Huang; Ewen F. Kirkness; Gennady Denisov; Yuan Lin; Jeffrey R. MacDonald; Andy Wing Chun Pang; Mary Shago; Timothy B. Stockwell; Alexia Tsiamouri; Vineet Bafna; Vikas Bansal; Saul Kravitz; Dana Busam; Karen Beeson; Tina McIntosh; Karin A. Remington; Josep F. Abril; John Gill; Jon Borman; Yu-Hui Rogers; Marvin Frazier; Stephen W. Scherer; Robert L. Strausberg

Presented here is a genome sequence of an individual human. It was produced from ∼32 million random DNA fragments, sequenced by Sanger dideoxy technology and assembled into 4,528 scaffolds, comprising 2,810 million bases (Mb) of contiguous sequence with approximately 7.5-fold coverage for any given region. We developed a modified version of the Celera assembler to facilitate the identification and comparison of alternate alleles within this individual diploid genome. Comparison of this genome and the National Center for Biotechnology Information human reference assembly revealed more than 4.1 million DNA variants, encompassing 12.3 Mb. These variants (of which 1,288,319 were novel) included 3,213,401 single nucleotide polymorphisms (SNPs), 53,823 block substitutions (2–206 bp), 292,102 heterozygous insertion/deletion events (indels) (1–571 bp), 559,473 homozygous indels (1–82,711 bp), 90 inversions, as well as numerous segmental duplications and copy number variation regions. Non-SNP DNA variants account for 22% of all events identified in the donor; however, they involve 74% of all variant bases. This suggests an important role for non-SNP genetic alterations in defining the diploid genome structure. Moreover, 44% of genes were heterozygous for one or more variants. Using a novel haplotype assembly strategy, we were able to span 1.5 Gb of genome sequence in segments >200 kb, providing further precision to the diploid nature of the genome. These data depict a definitive molecular portrait of a diploid human genome that provides a starting point for future genome comparisons and enables an era of individualized genomic information.


Science | 2010

Human Genome Sequencing Using Unchained Base Reads on Self-Assembling DNA Nanoarrays

Radoje Drmanac; Andrew Sparks; Matthew J. Callow; Aaron L. Halpern; Norman L. Burns; Bahram Ghaffarzadeh Kermani; Paolo Carnevali; Igor Nazarenko; Geoffrey B. Nilsen; George Yeung; Fredrik Dahl; Andres Fernandez; Bryan Staker; Krishna Pant; Jonathan Baccash; Adam P. Borcherding; Anushka Brownley; Ryan Cedeno; Linsu Chen; Dan Chernikoff; Alex Cheung; Razvan Chirita; Benjamin Curson; Jessica Ebert; Coleen R. Hacker; Robert Hartlage; Brian Hauser; Steve Huang; Yuan Jiang; Vitali Karpinchyk

Toward $1000 Genomes. The ability to generate human genome sequence data that is complete, accurate, and inexpensive is a necessary prerequisite to perform genome-wide disease association studies. Drmanac et al. (p. 78, published online 5 November) present a technique advancing toward this goal. The method uses Type IIS endonucleases to incorporate short oligonucleotides within a set of randomly sheared circularized DNA. DNA polymerase then generates concatenated copies of the circular oligonucleotides, leading to formation of compact but very long oligonucleotides which are then sequenced by ligation. The relatively low cost of this technology, which shows a low error rate, advances sequencing closer to the goal of the $1000 genome. A low-cost sequencing technique advances us closer to the goal of the $1000 human genome.
Genome sequencing of large numbers of individuals promises to advance the understanding, treatment, and prevention of human diseases, among other applications. We describe a genome sequencing platform that achieves efficient imaging and low reagent consumption with combinatorial probe anchor ligation chemistry to independently assay each base from patterned nanoarrays of self-assembling DNA nanoballs. We sequenced three human genomes with this platform, generating an average of 45- to 87-fold coverage per genome and identifying 3.2 to 4.5 million sequence variants per genome. Validation of one genome data set demonstrates a sequence accuracy of about 1 false variant per 100 kilobases. The high accuracy, affordable cost of $4400 for sequencing consumables, and scalability of this platform enable complete human genome sequencing for the detection of rare variants in large-scale genetic studies.


PLOS Biology | 2007

The Sorcerer II Global Ocean Sampling Expedition: Expanding the Universe of Protein Families

Shibu Yooseph; Granger Sutton; Douglas B. Rusch; Aaron L. Halpern; Shannon J. Williamson; Karin A. Remington; Jonathan A. Eisen; Karla B. Heidelberg; Gerard Manning; Weizhong Li; Lukasz Jaroszewski; Piotr Cieplak; Christopher S. Miller; Huiying Li; Susan T. Mashiyama; Marcin P Joachimiak; Christopher van Belle; John-Marc Chandonia; David A W Soergel; Yufeng Zhai; Kannan Natarajan; Shaun W. Lee; Benjamin J. Raphael; Vineet Bafna; Robert Friedman; Steven E. Brenner; Adam Godzik; David Eisenberg; Jack E. Dixon; Susan S. Taylor

Metagenomics projects based on shotgun sequencing of populations of micro-organisms yield insight into protein families. We used sequence similarity clustering to explore proteins with a comprehensive dataset consisting of sequences from available databases together with 6.12 million proteins predicted from an assembly of 7.7 million Global Ocean Sampling (GOS) sequences. The GOS dataset covers nearly all known prokaryotic protein families. A total of 3,995 medium- and large-sized clusters consisting of only GOS sequences are identified, out of which 1,700 have no detectable homology to known families. The GOS-only clusters contain a higher than expected proportion of sequences of viral origin, thus reflecting a poor sampling of viral diversity until now. Protein domain distributions in the GOS dataset and current protein databases show distinct biases. Several protein domains that were previously categorized as kingdom specific are shown to have GOS examples in other kingdoms. About 6,000 sequences (ORFans) from the literature that heretofore lacked similarity to known proteins have matches in the GOS data. The GOS dataset is also used to improve remote homology detection. Overall, besides nearly doubling the number of current proteins, the predicted GOS proteins also add a great deal of diversity to known protein families and shed light on their evolution. These observations are illustrated using several protein families, including phosphatases, proteases, ultraviolet-irradiation DNA damage repair enzymes, glutamine synthetase, and RuBisCO. The diversity added by GOS data has implications for choosing targets for experimental structure characterization as part of structural genomics efforts. Our analysis indicates that new families are being discovered at a rate that is linear or almost linear with the addition of new sequences, implying that we are still far from discovering all protein families in nature.


Genome Biology | 2002

Finishing a whole-genome shotgun: Release 3 of the Drosophila melanogaster euchromatic genome sequence

Susan E. Celniker; David A. Wheeler; Brent Kronmiller; Joseph W. Carlson; Aaron L. Halpern; Sandeep Patel; Mark D. Adams; Mark Champe; Shannon Dugan; Erwin Frise; Ann Hodgson; Reed A. George; Roger A. Hoskins; Todd R. Laverty; Donna M. Muzny; Catherine R. Nelson; Joanne Pacleb; Soo Park; Barret D. Pfeiffer; Stephen Richards; Erica Sodergren; Robert Svirskas; Paul E. Tabor; Kenneth H. Wan; Mark Stapleton; Granger Sutton; Craig Venter; George M. Weinstock; Steven E. Scherer; Eugene W. Myers

Background: The Drosophila melanogaster genome was the first metazoan genome to have been sequenced by the whole-genome shotgun (WGS) method. Two issues relating to this achievement were widely debated in the genomics community: how correct is the sequence with respect to base-pair (bp) accuracy and frequency of assembly errors? And, how difficult is it to bring a WGS sequence to the accepted standard for finished sequence? We are now in a position to answer these questions. Results: Our finishing process was designed to close gaps, improve sequence quality and validate the assembly. Sequence traces derived from the WGS and draft sequencing of individual bacterial artificial chromosomes (BACs) were assembled into BAC-sized segments. These segments were brought to high quality, and then joined to constitute the sequence of each chromosome arm. Overall assembly was verified by comparison to a physical map of fingerprinted BAC clones. In the current version of the 116.9 Mb euchromatic genome, called Release 3, the six euchromatic chromosome arms are represented by 13 scaffolds with a total of 37 sequence gaps. We compared Release 3 to Release 2; in autosomal regions of unique sequence, the error rate of Release 2 was one in 20,000 bp. Conclusions: The WGS strategy can efficiently produce a high-quality sequence of a metazoan genome while generating the reagents required for sequence finishing. However, the initial method of repeat assembly was flawed. The sequence we report here, Release 3, is a reliable resource for molecular genetic experimentation and computational analysis.


PLOS Biology | 2007

Survey Sequencing and Comparative Analysis of the Elephant Shark (Callorhinchus milii) Genome

Byrappa Venkatesh; Ewen F. Kirkness; Yong-Hwee Eddie Loh; Aaron L. Halpern; Alison Lee; Justin Johnson; Nidhi Dandona; Lakshmi Viswanathan; Alice Tay; J. Craig Venter; Robert L. Strausberg; Sydney Brenner

Owing to their phylogenetic position, cartilaginous fishes (sharks, rays, skates, and chimaeras) provide a critical reference for our understanding of vertebrate genome evolution. The relatively small genome of the elephant shark, Callorhinchus milii, a chimaera, makes it an attractive model cartilaginous fish genome for whole-genome sequencing and comparative analysis. Here, the authors describe survey sequencing (1.4× coverage) and comparative analysis of the elephant shark genome, one of the first cartilaginous fish genomes to be sequenced to this depth. Repetitive sequences, represented mainly by a novel family of short interspersed element–like and long interspersed element–like sequences, account for about 28% of the elephant shark genome. Fragments of approximately 15,000 elephant shark genes reveal specific examples of genes that have been lost differentially during the evolution of tetrapod and teleost fish lineages. Interestingly, the degree of conserved synteny and conserved sequences between the human and elephant shark genomes is higher than that between human and teleost fish genomes. The elephant shark genome contains four putative Hox clusters, indicating that, unlike teleost fish genomes, it has not experienced an additional whole-genome duplication. These findings underscore the importance of the elephant shark as a critical reference vertebrate genome for comparative analysis of the human and other vertebrate genomes. This study also demonstrates that a survey-sequencing approach can be applied productively for comparative analysis of distantly related vertebrate genomes.


The ISME Journal | 2012

Genomic insights to SAR86, an abundant and uncultivated marine bacterial lineage

Chris L. Dupont; Douglas B. Rusch; Shibu Yooseph; Mary-Jane Lombardo; R. Alexander Richter; Ruben E. Valas; Mark Novotny; Joyclyn Yee-Greenbaum; Jeremy D. Selengut; Daniel H. Haft; Aaron L. Halpern; Roger S. Lasken; Kenneth H. Nealson; Robert M. Friedman; J. Craig Venter


PLOS ONE | 2008

The Sorcerer II Global Ocean Sampling Expedition: Metagenomic Characterization of Viruses within Aquatic Microbial Samples

Shannon J. Williamson; Douglas B. Rusch; Shibu Yooseph; Aaron L. Halpern; Karla B. Heidelberg; John I. Glass; Cynthia Andrews-Pfannkoch; Douglas W. Fadrosh; Christopher S. Miller; Granger Sutton; Marvin Frazier; J. Craig Venter


Nature | 2004

Shotgun sequence assembly and recent segmental duplications within the human genome

Xinwei She; Zhaoshi Jiang; Royden A. Clark; Ge Liu; Ze Cheng; Eray Tuzun; Deanna M. Church; Granger Sutton; Aaron L. Halpern; Evan E. Eichler


PLOS Genetics | 2007

Nanoliter Reactors Improve Multiple Displacement Amplification of Genomes from Single Cells

Yann Marcy; Thomas Ishoey; Roger S. Lasken; Timothy B. Stockwell; Brian Walenz; Aaron L. Halpern; Karen Beeson; Susanne M. D. Goldberg; Stephen R. Quake

Collaboration


Top Co-Authors

Granger Sutton
J. Craig Venter Institute

J. Craig Venter
J. Craig Venter Institute

Shibu Yooseph
J. Craig Venter Institute

Douglas B. Rusch
Indiana University Bloomington

Knut Reinert
Free University of Berlin

Brian Walenz
J. Craig Venter Institute