Brian J. Haas | Researchain

Publication

Featured research published by Brian J. Haas.

Nature Biotechnology | 2011

Full-length transcriptome assembly from RNA-Seq data without a reference genome

Manfred Grabherr; Brian J. Haas; Moran Yassour; Joshua Z. Levin; Dawn Anne Thompson; Ido Amit; Xian Adiconis; Lin Fan; Raktima Raychowdhury; Qiandong Zeng; Zehua Chen; Evan Mauceli; Nir Hacohen; Andreas Gnirke; Nicholas Rhind; Federica Di Palma; Bruce Birren; Chad Nusbaum; Kerstin Lindblad-Toh; Nir Friedman; Aviv Regev

Massively parallel sequencing of cDNA has enabled deep and efficient probing of transcriptomes. Current approaches for transcript reconstruction from such data often rely on aligning reads to a reference genome, and are thus unsuitable for samples with a partial or missing reference genome. Here we present the Trinity method for de novo assembly of full-length transcripts and evaluate it on samples from fission yeast, mouse and whitefly, whose reference genome is not yet available. By efficiently constructing and analyzing sets of de Bruijn graphs, Trinity fully reconstructs a large fraction of transcripts, including alternatively spliced isoforms and transcripts from recently duplicated genes. Compared with other de novo transcriptome assemblers, Trinity recovers more full-length transcripts across a broad range of expression levels, with a sensitivity similar to methods that rely on genome alignments. Our approach provides a unified solution for transcriptome reconstruction in any sample, especially in the absence of a reference genome.

Bioinformatics | 2011

UCHIME improves sensitivity and speed of chimera detection

Robert C. Edgar; Brian J. Haas; Jose C. Clemente; Christopher Quince; Rob Knight

Motivation: Chimeric DNA sequences often form during polymerase chain reaction amplification, especially when sequencing single regions (e.g. 16S rRNA or fungal Internal Transcribed Spacer) to assess diversity or compare populations. Undetected chimeras may be misinterpreted as novel species, causing inflated estimates of diversity and spurious inferences of differences between populations. Detection and removal of chimeras is therefore of critical importance in such experiments. Results: We describe UCHIME, a new program that detects chimeric sequences with two or more segments. UCHIME either uses a database of chimera-free sequences or detects chimeras de novo by exploiting abundance data. UCHIME has better sensitivity than ChimeraSlayer (previously the most sensitive database method), especially with short, noisy sequences. In testing on artificial bacterial communities with known composition, UCHIME de novo sensitivity is shown to be comparable to Perseus. UCHIME is >100× faster than Perseus and >1000× faster than ChimeraSlayer. Availability: Source, binaries and data: http://drive5.com/uchime. Supplementary information: Supplementary data are available at Bioinformatics online.

Genome Research | 2011

Chimeric 16S rRNA sequence formation and detection in Sanger and 454-pyrosequenced PCR amplicons

Brian J. Haas; Dirk Gevers; Ashlee M. Earl; Mike Feldgarden; Doyle V. Ward; Georgia Giannoukos; Dawn Ciulla; Diana Tabbaa; Sarah K. Highlander; Erica Sodergren; Barbara A. Methé; Todd Z. DeSantis; Joseph F. Petrosino; Rob Knight; Bruce Birren

Bacterial diversity among environmental samples is commonly assessed with PCR-amplified 16S rRNA gene (16S) sequences. Perceived diversity, however, can be influenced by sample preparation, primer selection, and formation of chimeric 16S amplification products. Chimeras are hybrid products between multiple parent sequences that can be falsely interpreted as novel organisms, thus inflating apparent diversity. We developed a new chimera detection tool called Chimera Slayer (CS). CS detects chimeras with greater sensitivity than previous methods, performs well on short sequences such as those produced by the 454 Life Sciences (Roche) Genome Sequencer, and can scale to large data sets. By benchmarking CS performance against sequences derived from a controlled DNA mixture of known organisms and a simulated chimera set, we provide insights into the factors that affect chimera formation such as sequence abundance, the extent of similarity between 16S genes, and PCR conditions. Chimeras were found to reproducibly form among independent amplifications and contributed to false perceptions of sample diversity and the false identification of novel taxa, with less-abundant species exhibiting chimera rates exceeding 70%. Shotgun metagenomic sequences of our mock community appear to be devoid of 16S chimeras, supporting a role for shotgun metagenomics in validating novel organisms discovered in targeted sequence surveys.

Nature | 2005

Genomic sequence of the pathogenic and allergenic filamentous fungus Aspergillus fumigatus

William C. Nierman; Arnab Pain; Michael J. Anderson; Jennifer R. Wortman; H. Stanley Kim; Javier Arroyo; Matthew Berriman; Keietsu Abe; David B. Archer; Clara Bermejo; Joan W. Bennett; Paul Bowyer; Dan Chen; Matthew Collins; Richard Coulsen; Robert Davies; Paul S. Dyer; Mark L. Farman; Nadia Fedorova; Natalie D. Fedorova; Tamara V. Feldblyum; Reinhard Fischer; Nigel Fosker; Audrey Fraser; José Luis García; María José García; Ariette Goble; Gustavo H. Goldman; Katsuya Gomi; Sam Griffith-Jones

Aspergillus fumigatus is exceptional among microorganisms in being both a primary and opportunistic pathogen as well as a major allergen. Its conidia production is prolific, and so human respiratory tract exposure is almost constant. A. fumigatus is isolated from human habitats and vegetable compost heaps. In immunocompromised individuals, the incidence of invasive infection can be as high as 50% and the mortality rate is often about 50% (ref. 2). The interaction of A. fumigatus and other airborne fungi with the immune system is increasingly linked to severe asthma and sinusitis. Although the burden of invasive disease caused by A. fumigatus is substantial, the basic biology of the organism is mostly obscure. Here we show the complete 29.4-megabase genome sequence of the clinical isolate Af293, which consists of eight chromosomes containing 9,926 predicted genes. Microarray analysis revealed temperature-dependent expression of distinct sets of genes, as well as 700 A. fumigatus genes not present or significantly diverged in the closely related sexual species Neosartorya fischeri, many of which may have roles in the pathogenicity phenotype. The Af293 genome sequence provides an unparalleled resource for the future understanding of this remarkable fungus.

Nature | 2009

Genome sequence and analysis of the Irish potato famine pathogen Phytophthora infestans

Brian J. Haas; Sophien Kamoun; Michael C. Zody; Rays H. Y. Jiang; Robert E. Handsaker; Liliana M. Cano; Manfred Grabherr; Chinnappa D. Kodira; Sylvain Raffaele; Trudy Torto-Alalibo; Tolga O. Bozkurt; Audrey M. V. Ah-Fong; Lucia Alvarado; Vicky L. Anderson; Miles R. Armstrong; Anna O. Avrova; Laura Baxter; Jim Beynon; Petra C. Boevink; Stephanie R. Bollmann; Jorunn I. B. Bos; Vincent Bulone; Guohong Cai; Cahid Cakir; James C. Carrington; Megan Chawner; Lucio Conti; Stefano Costanzo; Richard Ewan; Noah Fahlgren

Phytophthora infestans is the most destructive pathogen of potato and a model organism for the oomycetes, a distinct lineage of fungus-like eukaryotes that are related to organisms such as brown algae and diatoms. As the agent of the Irish potato famine in the mid-nineteenth century, P. infestans has had a tremendous effect on human history, resulting in famine and population displacement. To this day, it affects world agriculture by causing the most destructive disease of potato, the fourth largest food crop and a critical alternative to the major cereal crops for feeding the world’s population. Current annual worldwide potato crop losses due to late blight are conservatively estimated at $6.7 billion. Management of this devastating pathogen is challenged by its remarkable speed of adaptation to control strategies such as genetically resistant cultivars. Here we report the sequence of the P. infestans genome, which at ∼240 megabases (Mb) is by far the largest and most complex genome sequenced so far in the chromalveolates. Its expansion results from a proliferation of repetitive DNA accounting for ∼74% of the genome. Comparison with two other Phytophthora genomes showed rapid turnover and extensive expansion of specific families of secreted disease effector proteins, including many genes that are induced during infection or are predicted to have activities that alter host physiology. These fast-evolving effector genes are localized to highly dynamic and expanded regions of the P. infestans genome. This probably plays a crucial part in the rapid adaptability of the pathogen to host plants and underpins its evolutionary potential.

Science | 2007

Genome sequence of Aedes aegypti, a major arbovirus vector

Vishvanath Nene; Jennifer R. Wortman; Daniel John Lawson; Brian J. Haas; Chinnappa D. Kodira; Zhijian Jake Tu; Brendan J. Loftus; Zhiyong Xi; Karyn Megy; Manfred Grabherr; Quinghu Ren; Evgeny M. Zdobnov; Neil F. Lobo; Kathryn S. Campbell; Susan E. Brown; Maria F. Bonaldo; Jingsong Zhu; Steven P. Sinkins; David G. Hogenkamp; Paolo Amedeo; Peter Arensburger; Peter W. Atkinson; Shelby Bidwell; Jim Biedler; Ewan Birney; Robert V. Bruggner; Javier Costas; Monique R. Coy; Jonathan Crabtree; Matt Crawford

We present a draft sequence of the genome of Aedes aegypti, the primary vector for yellow fever and dengue fever, which at ∼1376 million base pairs is about 5 times the size of the genome of the malaria vector Anopheles gambiae. Nearly 50% of the Ae. aegypti genome consists of transposable elements. These contribute to a factor of ∼4 to 6 increase in average gene length and in sizes of intergenic regions relative to An. gambiae and Drosophila melanogaster. Nonetheless, chromosomal synteny is generally maintained among all three insects, although conservation of orthologous gene order is higher (by a factor of ∼2) between the mosquito species than between either of them and the fruit fly. An increase in genes encoding odorant binding, cytochrome P450, and cuticle domains relative to An. gambiae suggests that members of these protein families underpin some of the biological differences between the two mosquito species.

Nucleic Acids Research | 2007

The TIGR Rice Genome Annotation Resource: improvements and new features

Shu Ouyang; Wei Zhu; John A. Hamilton; Haining Lin; Matthew Campbell; Kevin L. Childs; Françoise Thibaud-Nissen; Renae L. Malek; Yuandan Lee; Li Zheng; Joshua Orvis; Brian J. Haas; Jennifer R. Wortman; C. Robin Buell

In The Institute for Genomic Research Rice Genome Annotation project, we have continued to update the rice genome sequence with new data and improve the quality of the annotation. In our current release of annotation (Release 4.0; January 12, 2006), we have identified 42 653 non-transposable element-related genes encoding 49 472 gene models as a result of the detection of alternative splicing. We have refined our identification methods for transposable element-related genes resulting in 13 237 genes that are related to transposable elements. Through incorporation of multiple transcript and proteomic expression data sets, we have been able to annotate 24 799 genes (31 739 gene models), representing ∼50% of the total gene models, as expressed in the rice genome. All structural and functional annotation is viewable through our Rice Genome Browser which currently supports 59 tracks. Enhanced data access is available through web interfaces, FTP downloads and a Data Extractor tool developed in order to support discrete dataset downloads.

Nature | 2009

The genome of the blood fluke Schistosoma mansoni

Matthew Berriman; Brian J. Haas; Philip T. LoVerde; R. Alan Wilson; Gary P. Dillon; Gustavo C. Cerqueira; Susan T. Mashiyama; Bissan Al-Lazikani; Luiza F. Andrade; Peter D. Ashton; Martin Aslett; Daniella Castanheira Bartholomeu; Gaëlle Blandin; Conor R. Caffrey; Avril Coghlan; Richard M. R. Coulson; Tim A. Day; Arthur L. Delcher; Ricardo DeMarco; Appoliniare Djikeng; Tina Eyre; John Gamble; Elodie Ghedin; Yong-Hong Gu; Christiane Hertz-Fowler; Hirohisha Hirai; Yuriko Hirai; Robin Houston; Alasdair Ivens; David A. Johnston

Schistosoma mansoni is responsible for the neglected tropical disease schistosomiasis that affects 210 million people in 76 countries. Here we present analysis of the 363 megabase nuclear genome of the blood fluke. It encodes at least 11,809 genes, with an unusual intron size distribution, and new families of micro-exon genes that undergo frequent alternative splicing. As the first sequenced flatworm, and a representative of the Lophotrochozoa, it offers insights into early events in the evolution of the animals, including the development of a body pattern with bilateral symmetry, and the development of tissues into organs. Our analysis has been informed by the need to find new drug targets. The deficits in lipid metabolism that make schistosomes dependent on the host are revealed, and the identification of membrane receptors, ion channels and more than 300 proteases provide new insights into the biology of the life cycle and new targets. Bioinformatics approaches have identified metabolic chokepoints, and a chemogenomic screen has pinpointed schistosome proteins for which existing drugs may be active. The information generated provides an invaluable resource for the research community to develop much needed new control tools for the treatment and eradication of this important and neglected disease.

Nature | 2008

The draft genome of the transgenic tropical fruit tree papaya (Carica papaya Linnaeus)

Ray Ming; Shaobin Hou; Yun Feng; Qingyi Yu; Alexandre Dionne-Laporte; Jimmy H. Saw; Pavel Senin; Wei Wang; Benjamin V. Ly; Kanako L. T. Lewis; Lu Feng; Meghan R. Jones; Rachel L. Skelton; Jan E. Murray; Cuixia Chen; Wubin Qian; Junguo Shen; Peng Du; Moriah Eustice; Eric J. Tong; Haibao Tang; Eric Lyons; Robert E. Paull; Todd P. Michael; Kerr Wall; Danny W. Rice; Henrik H. Albert; Ming Li Wang; Yun J. Zhu; Michael C. Schatz

Papaya, a fruit crop cultivated in tropical and subtropical regions, is known for its nutritional benefits and medicinal applications. Here we report a 3× draft genome sequence of ‘SunUp’ papaya, the first commercial virus-resistant transgenic fruit tree to be sequenced. The papaya genome is three times the size of the Arabidopsis genome, but contains fewer genes, including significantly fewer disease-resistance gene analogues. Comparison of the five sequenced genomes suggests a minimal angiosperm gene set of 13,311. A lack of recent genome duplication, atypical of other angiosperm genomes sequenced so far, may account for the smaller papaya gene number in most functional groups. Nonetheless, striking amplifications in gene number within particular functional groups suggest roles in the evolution of tree-like habit, deposition and remobilization of starch reserves, attraction of seed dispersal agents, and adaptation to tropical daylengths. Transgenesis at three locations is closely associated with chloroplast insertions into the nuclear genome, and with topoisomerase I recognition sites. Papaya offers numerous advantages as a system for fruit-tree functional genomics, and this draft genome sequence provides the foundation for revealing the basis of Carica’s distinguishing morpho-physiological, medicinal and nutritional properties.

Science | 2007

Draft Genome of the Filarial Nematode Parasite Brugia malayi

Elodie Ghedin; Shiliang Wang; David J. Spiro; Elisabet Caler; Qi Zhao; Jonathan Crabtree; Jonathan E. Allen; Arthur L. Delcher; David B. Guiliano; Diego Miranda-Saavedra; Samuel V. Angiuoli; Todd Creasy; Paolo Amedeo; Brian J. Haas; Najib M. El-Sayed; Jennifer R. Wortman; Tamara Feldblyum; Luke J. Tallon; Michael C. Schatz; Martin Shumway; Hean Koo; Seth Schobel; Mihaela Pertea; Mihai Pop; Owen White; Geoffrey J. Barton; Clotilde K. S. Carlow; Michael J. Crawford; Jennifer Daub; Matthew W. Dimmic
