Publication


Featured research published by Andreas Gnirke.


Nature Biotechnology | 2011

Full-length transcriptome assembly from RNA-Seq data without a reference genome

Manfred Grabherr; Brian J. Haas; Moran Yassour; Joshua Z. Levin; Dawn Anne Thompson; Ido Amit; Xian Adiconis; Lin Fan; Raktima Raychowdhury; Qiandong Zeng; Zehua Chen; Evan Mauceli; Nir Hacohen; Andreas Gnirke; Nicholas Rhind; Federica Di Palma; Bruce Birren; Chad Nusbaum; Kerstin Lindblad-Toh; Nir Friedman; Aviv Regev

Massively parallel sequencing of cDNA has enabled deep and efficient probing of transcriptomes. Current approaches for transcript reconstruction from such data often rely on aligning reads to a reference genome, and are thus unsuitable for samples with a partial or missing reference genome. Here we present the Trinity method for de novo assembly of full-length transcripts and evaluate it on samples from fission yeast, mouse and whitefly, whose reference genome is not yet available. By efficiently constructing and analyzing sets of de Bruijn graphs, Trinity fully reconstructs a large fraction of transcripts, including alternatively spliced isoforms and transcripts from recently duplicated genes. Compared with other de novo transcriptome assemblers, Trinity recovers more full-length transcripts across a broad range of expression levels, with a sensitivity similar to methods that rely on genome alignments. Our approach provides a unified solution for transcriptome reconstruction in any sample, especially in the absence of a reference genome.
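The abstract above mentions constructing and analyzing de Bruijn graphs. As a rough illustration of that general idea only (this is not Trinity's implementation; the function names, the toy reads, and the choice of k are all assumptions made for this example), a minimal Python sketch might look like this:

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Each k-mer contributes one edge: prefix (k-1)-mer -> suffix (k-1)-mer.
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def extend_path(graph, start):
    """Follow edges from `start` while each node has exactly one successor."""
    seq = start
    node = start
    visited = {start}
    while len(graph.get(node, ())) == 1:
        (nxt,) = graph[node]
        if nxt in visited:  # stop on a cycle instead of looping forever
            break
        seq += nxt[-1]
        visited.add(nxt)
        node = nxt
    return seq


# Hypothetical overlapping reads drawn from one short toy sequence.
reads = ["ATGGCGT", "GGCGTGC", "GTGCAAT"]
graph = build_de_bruijn(reads, k=4)
contig = extend_path(graph, "ATG")
print(contig)  # ATGGCGTGCAAT
```

Real assemblers additionally have to handle sequencing errors, branching nodes (for example from alternative splicing), and coverage information, none of which this sketch attempts.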


Science | 2009

Comprehensive mapping of long range interactions reveals folding principles of the human genome

Erez Lieberman-Aiden; Nynke L. van Berkum; Louise Williams; Maxim Imakaev; Tobias Ragoczy; Agnes Telling; Ido Amit; Bryan R. Lajoie; Peter J. Sabo; Michael O. Dorschner; Richard Sandstrom; Bradley E. Bernstein; Michael Bender; Mark Groudine; Andreas Gnirke; John A. Stamatoyannopoulos; Leonid A. Mirny; Eric S. Lander

Chromosomal Mapping

The conformation of the genome in the nucleus and contacts between both proximal and distal loci influence gene expression. In order to map genomic contacts, Lieberman-Aiden et al. (p. 289, see the cover) developed a technique to allow the detection of all interactions between genomic loci in the eukaryotic nucleus followed by deep sequencing. This technology was used to map the organization of the human genome and to examine the spatial proximity of chromosomal loci at one megabase resolution. The map suggests that the genome is partitioned into two spatial compartments that are related to local chromatin state and whose remodeling correlates with changes in the chromatin state. Chromosomes are organized in a fractal knot-free conformation that is densely packed while easily folded and unfolded.

We describe Hi-C, a method that probes the three-dimensional architecture of whole genomes by coupling proximity-based ligation with massively parallel sequencing. We constructed spatial proximity maps of the human genome with Hi-C at a resolution of 1 megabase. These maps confirm the presence of chromosome territories and the spatial proximity of small, gene-rich chromosomes. We identified an additional level of genome organization that is characterized by the spatial segregation of open and closed chromatin to form two genome-wide compartments. At the megabase scale, the chromatin conformation is consistent with a fractal globule, a knot-free, polymer conformation that enables maximally dense packing while preserving the ability to easily fold and unfold any genomic locus. The fractal globule is distinct from the more commonly used globular equilibrium model. Our results demonstrate the power of Hi-C to map the dynamic conformations of whole genomes.


Nature | 2008

Genome-scale DNA methylation maps of pluripotent and differentiated cells

Alexander Meissner; Tarjei S. Mikkelsen; Hongcang Gu; Marius Wernig; Jacob Hanna; Andrey Sivachenko; Xiaolan Zhang; Bradley E. Bernstein; Chad Nusbaum; David B. Jaffe; Andreas Gnirke; Rudolf Jaenisch; Eric S. Lander

DNA methylation is essential for normal development and has been implicated in many pathologies including cancer. Our knowledge about the genome-wide distribution of DNA methylation, how it changes during cellular differentiation and how it relates to histone methylation and other chromatin modifications in mammals remains limited. Here we report the generation and analysis of genome-scale DNA methylation profiles at nucleotide resolution in mammalian cells. Using high-throughput reduced representation bisulphite sequencing and single-molecule-based sequencing, we generated DNA methylation maps covering most CpG islands, and a representative sampling of conserved non-coding elements, transposons and other genomic features, for mouse embryonic stem cells, embryonic-stem-cell-derived and primary neural cells, and eight other primary tissues. Several key findings emerge from the data. First, DNA methylation patterns are better correlated with histone methylation patterns than with the underlying genome sequence context. Second, CpG methylation is a dynamic epigenetic mark that undergoes extensive changes during cellular differentiation, particularly in regulatory regions outside of core promoters. Third, analysis of embryonic-stem-cell-derived and primary cells reveals that ‘weak’ CpG islands associated with a specific set of developmentally regulated genes undergo aberrant hypermethylation during extended proliferation in vitro, in a pattern reminiscent of that reported in some primary tumours. More generally, the results establish reduced representation bisulphite sequencing as a powerful technology for epigenetic profiling of cell populations relevant to developmental biology, cancer and regenerative medicine.


Proceedings of the National Academy of Sciences of the United States of America | 2011

High-quality draft assemblies of mammalian genomes from massively parallel sequence data

Sante Gnerre; Iain MacCallum; Dariusz Przybylski; Filipe J. Ribeiro; Joshua N. Burton; Bruce J. Walker; Ted Sharpe; Giles Hall; Terrance Shea; Sean Sykes; Aaron M. Berlin; Daniel Aird; Maura Costello; Riza Daza; Louise Williams; Robert Nicol; Andreas Gnirke; Chad Nusbaum; Eric S. Lander; David B. Jaffe

Massively parallel DNA sequencing technologies are revolutionizing genomics by making it possible to generate billions of relatively short (~100-base) sequence reads at very low cost. Whereas such data can be readily used for a wide range of biomedical applications, it has proven difficult to use them to generate high-quality de novo genome assemblies of large, repeat-rich vertebrate genomes. To date, the genome assemblies generated from such data have fallen far short of those obtained with the older (but much more expensive) capillary-based sequencing approach. Here, we report the development of an algorithm for genome assembly, ALLPATHS-LG, and its application to massively parallel DNA sequence data from the human and mouse genomes, generated on the Illumina platform. The resulting draft genome assemblies have good accuracy, short-range contiguity, long-range connectivity, and coverage of the genome. In particular, the base accuracy is high (≥99.95%) and the scaffold sizes (N50 size = 11.5 Mb for human and 7.2 Mb for mouse) approach those obtained with capillary-based sequencing. The combination of improved sequencing technology and improved computational methods should now make it possible to increase dramatically the de novo sequencing of large genomes. The ALLPATHS-LG program is available at http://www.broadinstitute.org/science/programs/genome-biology/crd.
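The scaffold sizes in the abstract above are reported as N50 values. For readers unfamiliar with the metric, here is a minimal Python sketch of the commonly used N50 definition; the helper name `n50` and the toy scaffold lengths are assumptions for illustration and are not taken from ALLPATHS-LG.

```python
def n50(lengths):
    """Return the N50: the largest length L such that sequences of
    length >= L together cover at least half of the total size."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical scaffold lengths (arbitrary units), total = 300.
scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(scaffolds))  # 70, because 80 + 70 = 150 reaches half of 300
```

A higher N50 indicates that more of the assembly sits in long scaffolds, which is why the abstract uses it as a measure of long-range connectivity.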


Nature Biotechnology | 2010

Ab initio reconstruction of cell type-specific transcriptomes in mouse reveals the conserved multi-exonic structure of lincRNAs

Mitchell Guttman; Manuel Garber; Joshua Z. Levin; Julie Donaghey; James Robinson; Xian Adiconis; Lin Fan; Magdalena J. Koziol; Andreas Gnirke; Chad Nusbaum; John L. Rinn; Eric S. Lander; Aviv Regev

Massively parallel cDNA sequencing (RNA-Seq) provides an unbiased way to study a transcriptome, including both coding and noncoding genes. Until now, most RNA-Seq studies have depended crucially on existing annotations and thus focused on expression levels and variation in known transcripts. Here, we present Scripture, a method to reconstruct the transcriptome of a mammalian cell using only RNA-Seq reads and the genome sequence. We applied it to mouse embryonic stem cells, neuronal precursor cells and lung fibroblasts to accurately reconstruct the full-length gene structures for most known expressed genes. We identified substantial variation in protein coding genes, including thousands of novel 5′ start sites, 3′ ends and internal coding exons. We then determined the gene structures of more than a thousand large intergenic noncoding RNA (lincRNA) and antisense loci. Our results open the way to direct experimental manipulation of thousands of noncoding RNAs and demonstrate the power of ab initio reconstruction to render a comprehensive picture of mammalian transcriptomes.


Nature | 2006

Insights from the genome of the biotrophic fungal plant pathogen Ustilago maydis

Jörg Kämper; Regine Kahmann; Michael Bölker; Li-Jun Ma; Thomas Brefort; Barry J. Saville; Flora Banuett; James W. Kronstad; Scott E. Gold; Olaf Müller; Michael H. Perlin; Han A. B. Wösten; Ronald P. de Vries; José Ruiz-Herrera; Cristina G. Reynaga-Peña; Karen M. Snetselaar; Michael McCann; José Pérez-Martín; Michael Feldbrügge; Christoph W. Basse; Gero Steinberg; Jose I. Ibeas; William Holloman; Plinio Guzman; Mark L. Farman; Jason E. Stajich; Rafael Sentandreu; Juan M. González-Prieto; John C. Kennell; Lázaro Molina

Ustilago maydis is a ubiquitous pathogen of maize and a well-established model organism for the study of plant–microbe interactions. This basidiomycete fungus does not use aggressive virulence strategies to kill its host. U. maydis belongs to the group of biotrophic parasites (the smuts) that depend on living tissue for proliferation and development. Here we report the genome sequence for a member of this economically important group of biotrophic fungi. The 20.5-million-base U. maydis genome assembly contains 6,902 predicted protein-encoding genes and lacks pathogenicity signatures found in the genomes of aggressive pathogenic fungi, for example a battery of cell-wall-degrading enzymes. However, we detected unexpected genomic features responsible for the pathogenicity of this organism. Specifically, we found 12 clusters of genes encoding small secreted proteins with unknown function. A significant fraction of these genes exists in small gene families. Expression analysis showed that most of the genes contained in these clusters are regulated together and induced in infected tissue. Deletion of individual clusters altered the virulence of U. maydis in five cases, ranging from a complete lack of symptoms to hypervirulence. Despite years of research into the mechanism of pathogenicity in U. maydis, no ‘true’ virulence factors had been previously identified. Thus, the discovery of the secreted protein gene clusters and the functional demonstration of their decisive role in the infection process illuminate previously unknown mechanisms of pathogenicity operating in biotrophic fungi. Genomic analysis is, similarly, likely to open up new avenues for the discovery of virulence determinants in other pathogens.


Cell | 2011

Reference Maps of Human ES and iPS Cell Variation Enable High-Throughput Characterization of Pluripotent Cell Lines

Christoph Bock; Evangelos Kiskinis; Griet Verstappen; Hongcang Gu; Gabriella L. Boulting; Zachary D. Smith; Michael J. Ziller; Gist F. Croft; Mackenzie W. Amoroso; Derek Oakley; Andreas Gnirke; Kevin Eggan; Alexander Meissner

The developmental potential of human pluripotent stem cells suggests that they can produce disease-relevant cell types for biomedical research. However, substantial variation has been reported among pluripotent cell lines, which could affect their utility and clinical safety. Such cell-line-specific differences must be better understood before one can confidently use embryonic stem (ES) or induced pluripotent stem (iPS) cells in translational research. Toward this goal we have established genome-wide reference maps of DNA methylation and gene expression for 20 previously derived human ES lines and 12 human iPS cell lines, and we have measured the in vitro differentiation propensity of these cell lines. This resource enabled us to assess the epigenetic and transcriptional similarity of ES and iPS cells and to predict the differentiation efficiency of individual cell lines. The combination of assays yields a scorecard for quick and comprehensive characterization of pluripotent cell lines.


Nature | 2013

Charting a dynamic DNA methylation landscape of the human genome

Michael J. Ziller; Hongcang Gu; Fabian Müller; Julie Donaghey; Linus T.-Y. Tsai; Oliver Kohlbacher; Philip L. De Jager; Evan D. Rosen; David A. Bennett; Bradley E. Bernstein; Andreas Gnirke; Alexander Meissner

DNA methylation is a defining feature of mammalian cellular identity and is essential for normal development. Most cell types, except germ cells and pre-implantation embryos, display relatively stable DNA methylation patterns, with 70–80% of all CpGs being methylated. Despite recent advances, we still have a limited understanding of when, where and how many CpGs participate in genomic regulation. Here we report the in-depth analysis of 42 whole-genome bisulphite sequencing data sets across 30 diverse human cell and tissue types. We observe dynamic regulation for only 21.8% of autosomal CpGs within a normal developmental context, most of which are distal to transcription start sites. These dynamic CpGs co-localize with gene regulatory elements, particularly enhancers and transcription-factor-binding sites, which allow identification of key lineage-specific regulators. In addition, differentially methylated regions (DMRs) often contain single nucleotide polymorphisms associated with cell-type-related diseases as determined by genome-wide association studies. The results also highlight the general inefficiency of whole-genome bisulphite sequencing, as 70–80% of the sequencing reads across these data sets provided little or no relevant information about CpG methylation. To demonstrate further the utility of our DMR set, we use it to classify unknown samples and identify representative signature regions that recapitulate major DNA methylation dynamics. In summary, although in theory every CpG can change its methylation state, our results suggest that only a fraction does so as part of coordinated regulatory programs. Therefore, our selected DMRs can serve as a starting point to guide new, more effective reduced representation approaches to capture the most informative fraction of CpGs, as well as further pinpoint putative regulatory elements.


Science | 2014

Genomic surveillance elucidates Ebola virus origin and transmission during the 2014 outbreak

Stephen K. Gire; Augustine Goba; Kristian G. Andersen; Rachel Sealfon; Daniel J. Park; Lansana Kanneh; Simbirie Jalloh; Mambu Momoh; Mohamed Fullah; Gytis Dudas; Shirlee Wohl; Lina M. Moses; Nathan L. Yozwiak; Sarah M. Winnicki; Christian B. Matranga; Christine M. Malboeuf; James Qu; Adrianne D. Gladden; Stephen F. Schaffner; Xiao Yang; Pan Pan Jiang; Mahan Nekoui; Andres Colubri; Moinya Ruth Coomber; Mbalu Fonnie; Alex Moigboi; Michael Gbakie; Fatima K. Kamara; Veronica Tucker; Edwin Konuwa

In its largest outbreak, Ebola virus disease is spreading through Guinea, Liberia, Sierra Leone, and Nigeria. We sequenced 99 Ebola virus genomes from 78 patients in Sierra Leone to ~2000× coverage. We observed a rapid accumulation of interhost and intrahost genetic variation, allowing us to characterize patterns of viral transmission over the initial weeks of the epidemic. This West African variant likely diverged from central African lineages around 2004, crossed from Guinea to Sierra Leone in May 2014, and has exhibited sustained human-to-human transmission subsequently, with no evidence of additional zoonotic sources. Because many of the mutations alter protein sequences and other biologically meaningful targets, they should be monitored for impact on diagnostics, vaccines, and therapies critical to outbreak response.


Nucleic Acids Research | 2005

Reduced representation bisulfite sequencing for comparative high-resolution DNA methylation analysis

Alexander Meissner; Andreas Gnirke; George W. Bell; Bernard Ramsahoye; Eric S. Lander; Rudolf Jaenisch

We describe a large-scale random approach termed reduced representation bisulfite sequencing (RRBS) for analyzing and comparing genomic methylation patterns. BglII restriction fragments were size-selected to 500–600 bp, equipped with adapters, treated with bisulfite, PCR amplified, cloned and sequenced. We constructed RRBS libraries from murine ES cells and from ES cells lacking DNA methyltransferases Dnmt3a and 3b and with knocked-down (kd) levels of Dnmt1 (Dnmt[1kd,3a−/−,3b−/−]). Sequencing of 960 RRBS clones from Dnmt[1kd,3a−/−,3b−/−] cells generated 343 kb of non-redundant bisulfite sequence covering 66212 cytosines in the genome. All but 38 cytosines had been converted to uracil indicating a conversion rate of >99.9%. Of the remaining cytosines 35 were found in CpG and 3 in CpT dinucleotides. Non-CpG methylation was >250-fold reduced compared with wild-type ES cells, consistent with a role for Dnmt3a and/or Dnmt3b in CpA and CpT methylation. Closer inspection revealed neither a consensus sequence around the methylated sites nor evidence for clustering of residual methylation in the genome. Our findings indicate random loss rather than specific maintenance of methylation in Dnmt[1kd,3a−/−,3b−/−] cells. Near-complete bisulfite conversion and largely unbiased representation of RRBS libraries suggest that random shotgun bisulfite sequencing can be scaled to a genome-wide approach.
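The conversion rate quoted in the abstract above follows directly from the reported counts. The short Python check below reproduces the arithmetic; the variable names are chosen for this example, and the only inputs are the figures stated in the abstract.

```python
# Counts reported in the abstract for the Dnmt[1kd,3a-/-,3b-/-] libraries.
total_cytosines = 66212
unconverted_cpg = 35
unconverted_cpt = 3

unconverted = unconverted_cpg + unconverted_cpt  # 38 cytosines left unconverted
converted = total_cytosines - unconverted        # 66174 cytosines converted
conversion_rate = converted / total_cytosines

print(f"{conversion_rate:.2%}")  # 99.94%
```

The result, about 99.94%, is consistent with the abstract's statement of a conversion rate above 99.9%.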

Collaboration


Dive into Andreas Gnirke's collaborations.

Top Co-Authors

Aviv Regev

Massachusetts Institute of Technology
