Dariusz Przybylski
Broad Institute
Publications
Featured research published by Dariusz Przybylski.
Proceedings of the National Academy of Sciences of the United States of America | 2011
Sante Gnerre; Iain MacCallum; Dariusz Przybylski; Filipe J. Ribeiro; Joshua N. Burton; Bruce J. Walker; Ted Sharpe; Giles Hall; Terrance Shea; Sean Sykes; Aaron M. Berlin; Daniel Aird; Maura Costello; Riza Daza; Louise Williams; Robert Nicol; Andreas Gnirke; Chad Nusbaum; Eric S. Lander; David B. Jaffe
Massively parallel DNA sequencing technologies are revolutionizing genomics by making it possible to generate billions of relatively short (~100-base) sequence reads at very low cost. Whereas such data can be readily used for a wide range of biomedical applications, it has proven difficult to use them to generate high-quality de novo genome assemblies of large, repeat-rich vertebrate genomes. To date, the genome assemblies generated from such data have fallen far short of those obtained with the older (but much more expensive) capillary-based sequencing approach. Here, we report the development of an algorithm for genome assembly, ALLPATHS-LG, and its application to massively parallel DNA sequence data from the human and mouse genomes, generated on the Illumina platform. The resulting draft genome assemblies have good accuracy, short-range contiguity, long-range connectivity, and coverage of the genome. In particular, the base accuracy is high (≥99.95%) and the scaffold sizes (N50 size = 11.5 Mb for human and 7.2 Mb for mouse) approach those obtained with capillary-based sequencing. The combination of improved sequencing technology and improved computational methods should now make it possible to increase dramatically the de novo sequencing of large genomes. The ALLPATHS-LG program is available at http://www.broadinstitute.org/science/programs/genome-biology/crd.
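The scaffold N50 quoted in this abstract is a standard contiguity statistic: the length L such that scaffolds of length L or longer contain at least half of the assembled bases. The following is a minimal sketch of that conventional definition only; it is not taken from ALLPATHS-LG, and the function name `n50` is a hypothetical choice for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Conventional definition (an assumption here, not ALLPATHS-LG code):
    sort lengths from longest to shortest and return the length at which
    the running total first reaches half of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one length is required")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Toy lengths, not data from the paper.
    print(n50([2, 3, 4, 5, 6]))
```

Because the statistic depends only on the length distribution, a high N50 says nothing about correctness, which is why the abstract reports base accuracy and coverage alongside it.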
Nature | 2013
Chris T. Amemiya; Jessica Alföldi; Alison P. Lee; Shaohua Fan; Hervé Philippe; Iain MacCallum; Ingo Braasch; Tereza Manousaki; Igor Schneider; Nicolas Rohner; Chris Organ; Domitille Chalopin; Jeramiah J. Smith; Mark Robinson; Rosemary A. Dorrington; Marco Gerdol; Bronwen Aken; Maria Assunta Biscotti; Marco Barucca; Denis Baurain; Aaron M. Berlin; Francesco Buonocore; Thorsten Burmester; Michael S. Campbell; Adriana Canapa; John P. Cannon; Alan Christoffels; Gianluca De Moro; Adrienne L. Edkins; Lin Fan
The discovery of a living coelacanth specimen in 1938 was remarkable, as this lineage of lobe-finned fish was thought to have become extinct 70 million years ago. The modern coelacanth looks remarkably similar to many of its ancient relatives, and its evolutionary proximity to our own fish ancestors provides a glimpse of the fish that first walked on land. Here we report the genome sequence of the African coelacanth, Latimeria chalumnae. Through a phylogenomic analysis, we conclude that the lungfish, and not the coelacanth, is the closest living relative of tetrapods. Coelacanth protein-coding genes are significantly more slowly evolving than those of tetrapods, unlike other genomic features. Analyses of changes in genes and regulatory elements during the vertebrate adaptation to land highlight genes involved in immunity, nitrogen excretion and the development of fins, tail, ear, eye, brain and olfaction. Functional assays of enhancers involved in the fin-to-limb transition and in the emergence of extra-embryonic tissues show the importance of the coelacanth genome as a blueprint for understanding tetrapod evolution.
Genome Research | 2011
Dent Earl; Keith Bradnam; John St. John; Aaron E. Darling; Dawei Lin; Joseph Fass; Hung On Ken Yu; Vince Buffalo; Daniel R. Zerbino; Mark Diekhans; Ngan Nguyen; Pramila Ariyaratne; Wing-Kin Sung; Zemin Ning; Matthias Haimel; Jared T. Simpson; Nuno A. Fonseca; Inanc Birol; T. Roderick Docking; Isaac Ho; Daniel S. Rokhsar; Rayan Chikhi; Dominique Lavenier; Guillaume Chapuis; Delphine Naquin; Nicolas Maillet; Michael C. Schatz; David R. Kelley; Adam M. Phillippy; Sergey Koren
Low-cost short-read sequencing technology has revolutionized genomics, though it is only just becoming practical for the high-quality de novo assembly of a novel large genome. We describe the Assemblathon 1 competition, which aimed to comprehensively assess the state of the art in de novo assembly methods when applied to current sequencing technologies. In a collaborative effort, teams were asked to assemble a simulated Illumina HiSeq data set of an unknown, simulated diploid genome. A total of 41 assemblies from 17 different groups were received. Novel haplotype-aware assessments of coverage, contiguity, structure, base calling, and copy number were made. We establish that within this benchmark (1) it is possible to assemble the genome to a high level of coverage and accuracy, and (2) large differences exist between the assemblies, suggesting room for further improvements in current methods. The simulated benchmark, including the correct answer, the assemblies, and the code that was used to evaluate the assemblies, is now public and freely available from http://www.assemblathon.org/.
Nature | 2014
David Brawand; Catherine E. Wagner; Yang I. Li; Milan Malinsky; Irene Keller; Shaohua Fan; Oleg Simakov; Alvin Yu Jin Ng; Zhi Wei Lim; Etienne Bezault; Jason Turner-Maier; Jeremy A. Johnson; Rosa M. Alcazar; Hyun Ji Noh; Pamela Russell; Bronwen Aken; Jessica Alföldi; Chris T. Amemiya; Naoual Azzouzi; Jean-François Baroiller; Frédérique Barloy-Hubler; Aaron M. Berlin; Ryan F. Bloomquist; Karen L. Carleton; Matthew A. Conte; Helena D'Cotta; Orly Eshel; Leslie Gaffney; Francis Galibert; Hugo F. Gante
Cichlid fishes are famous for large, diverse and replicated adaptive radiations in the Great Lakes of East Africa. To understand the molecular mechanisms underlying cichlid phenotypic diversity, we sequenced the genomes and transcriptomes of five lineages of African cichlids: the Nile tilapia (Oreochromis niloticus), an ancestral lineage with low diversity; and four members of the East African lineage: Neolamprologus brichardi/pulcher (older radiation, Lake Tanganyika), Metriaclima zebra (recent radiation, Lake Malawi), Pundamilia nyererei (very recent radiation, Lake Victoria), and Astatotilapia burtoni (riverine species around Lake Tanganyika). We found an excess of gene duplications in the East African lineage compared to tilapia and other teleosts, an abundance of non-coding element divergence, accelerated coding sequence evolution, expression divergence associated with transposable element insertions, and regulation by novel microRNAs. In addition, we analysed sequence data from sixty individuals representing six closely related species from Lake Victoria, and show genome-wide diversifying selection on coding and regulatory variants, some of which were recruited from ancient polymorphisms. We conclude that a number of molecular mechanisms shaped East African cichlid genomes, and that amassing of standing variation during periods of relaxed purifying selection may have been important in facilitating subsequent evolutionary diversification.
GigaScience | 2013
Keith Bradnam; Joseph Fass; Anton Alexandrov; Paul Baranay; Michael Bechner; Inanc Birol; Sébastien Boisvert; Jarrod Chapman; Guillaume Chapuis; Rayan Chikhi; Hamidreza Chitsaz; Wen Chi Chou; Jacques Corbeil; Cristian Del Fabbro; Roderick R. Docking; Richard Durbin; Dent Earl; Scott J. Emrich; Pavel Fedotov; Nuno A. Fonseca; Ganeshkumar Ganapathy; Richard A. Gibbs; Sante Gnerre; Élénie Godzaridis; Steve Goldstein; Matthias Haimel; Giles Hall; David Haussler; Joseph Hiatt; Isaac Ho
Background: The process of generating raw genome sequence data continues to become cheaper, faster, and more accurate. However, assembly of such data into high-quality, finished genome sequences remains challenging. Many genome assembly tools are available, but they differ greatly in terms of their performance (speed, scalability, hardware requirements, acceptance of newer read technologies) and in their final output (composition of assembled sequence). More importantly, it remains largely unclear how to best assess the quality of assembled genome sequences. The Assemblathon competitions are intended to assess current state-of-the-art methods in genome assembly.
Results: In Assemblathon 2, we provided a variety of sequence data to be assembled for three vertebrate species (a bird, a fish, and a snake). This resulted in a total of 43 submitted assemblies from 21 participating teams. We evaluated these assemblies using a combination of optical map data, Fosmid sequences, and several statistical methods. From over 100 different metrics, we chose ten key measures by which to assess the overall quality of the assemblies.
Conclusions: Many current genome assemblers produced useful assemblies, containing a significant representation of their genes and overall genome structure. However, the high degree of variability between the entries suggests that there is still much room for improvement in the field of genome assembly and that approaches which work well in assembling the genome of one species may not necessarily work well for another.
Cell | 2015
Oren Parnas; Marko Jovanovic; Thomas Eisenhaure; Rebecca H. Herbst; Atray Dixit; Chun Jimmie Ye; Dariusz Przybylski; Randall Jeffrey Platt; Itay Tirosh; Neville E. Sanjana; Ophir Shalem; Rahul Satija; Raktima Raychowdhury; Philipp Mertins; Steven A. Carr; Feng Zhang; Nir Hacohen; Aviv Regev
Finding the components of cellular circuits and determining their functions systematically remains a major challenge in mammalian cells. Here, we introduced genome-wide pooled CRISPR-Cas9 libraries into dendritic cells (DCs) to identify genes that control the induction of tumor necrosis factor (Tnf) by bacterial lipopolysaccharide (LPS), a key process in the host response to pathogens, mediated by the Tlr4 pathway. We found many of the known regulators of Tlr4 signaling, as well as dozens of previously unknown candidates that we validated. By measuring protein markers and mRNA profiles in DCs that are deficient in known or candidate genes, we classified the genes into three functional modules with distinct effects on the canonical responses to LPS and highlighted functions for the PAF complex and oligosaccharyltransferase (OST) complex. Our findings uncover new facets of innate immune circuits in primary cells and provide a genetic approach for dissection of mammalian cell circuits.
Science | 2015
Marko Jovanovic; Michael S. Rooney; Philipp Mertins; Dariusz Przybylski; Nicolas Chevrier; Rahul Satija; Edwin H. Rodriguez; Alexander P. Fields; Schraga Schwartz; Raktima Raychowdhury; Maxwell R. Mumbach; Thomas Eisenhaure; Michal Rabani; Dave Gennert; Diana Lu; Toni Delorey; Jonathan S. Weissman; Steven A. Carr; Nir Hacohen; Aviv Regev
How the immune system readies for battle
Although gene expression is tightly controlled at both the RNA and protein levels, the quantitative contribution of each step, especially during dynamic responses, remains largely unknown. Indeed, there has been much debate whether changes in RNA level contribute substantially to protein-level regulation. Jovanovic et al. built a genome-scale model of the temporal dynamics of differential protein expression during the stimulation of immunological dendritic cells (see the Perspective by Li and Biggin). Newly stimulated functions involved the up-regulation of specific RNAs and concomitant increases in the levels of the proteins they encode, whereas housekeeping functions were regulated posttranscriptionally at the protein level.
Science, this issue 10.1126/science.1259038; see also p. 1066
Levels of “housekeeping” proteins are maintained directly, but those of immune response proteins depend on more transcription. [Also see Perspective by Li and Biggin]
INTRODUCTION: Mammalian gene expression is tightly controlled through the interplay between the RNA and protein life cycles. Although studies of individual genes have shown that regulation of each of these processes is important for correct protein expression, the quantitative contribution of each step to changes in protein expression levels remains largely unknown and much debated. Many studies have attempted to address this question in the context of steady-state protein levels, and comparing steady-state RNA and protein abundances has indicated a considerable discrepancy between RNA and protein levels. In contrast, only a few studies have attempted to shed light on how changes in each of these processes determine differential protein expression—either relative (ratios) or absolute (differences)—during dynamic responses, and only one recent report has attempted to quantitate each process. Understanding these contributions to a dynamic response on a systems scale is essential both for deciphering how cells deploy regulatory processes to accomplish physiological changes and for discovering key molecular regulators controlling each process.
RATIONALE: We developed an integrated experimental and computational strategy to quantitatively assess how protein levels are maintained in the context of a dynamic response and applied it to the model response of mouse immune bone marrow–derived dendritic cells (DCs) to stimulation with lipopolysaccharide (LPS). We used a modified pulsed-SILAC (stable isotope labeling with amino acids in cell culture) approach to track newly synthesized and previously labeled proteins over the first 12 hours of the response. In addition, we independently measured replicate RNA-sequencing profiles under the same conditions. We devised a computational strategy to infer per-mRNA translation rates and protein degradation rates at each time point from the temporal transcriptional profiles and pulsed-SILAC proteomics data. This allowed us to build a genome-scale quantitative model of the temporal dynamics of differential protein expression in DCs responding to LPS.
RESULTS: We found that before stimulation, mRNA levels contribute to overall protein expression levels more than double the combined contribution of protein translation and degradation rates. Upon LPS stimulation, changes in mRNA abundance play an even more dominant role in dynamic changes in protein levels, especially in immune response genes. Nevertheless, several protein modules—especially the preexisting proteome of proteins performing basic cellular functions—are predominantly regulated in stimulated cells at the level of protein translation or degradation, accounting for over half of the absolute change in protein molecules in the cell. In particular, despite the repression of their transcripts, the level of many proteins in the translational machinery is up-regulated upon LPS stimulation because of significantly increased translation rates, and elevated protein degradation of mitochondrial proteins plays a central role in remodeling cellular energy metabolism.
CONCLUSIONS: Our results support a model in which the induction of novel cellular functions is primarily driven through transcriptional changes, whereas regulation of protein production or degradation updates the levels of preexisting functions as required for an activated state. Our approach for building quantitative genome-scale models of the temporal dynamics of protein expression is broadly applicable to other dynamic systems.
Dynamic protein expression regulation in dendritic cells upon stimulation with LPS. We developed an integrated experimental and computational strategy to quantitatively assess how protein levels are maintained in the context of a dynamic response. Our results support a model in which the induction of novel cellular functions is primarily driven through transcriptional changes, whereas regulation of protein production or degradation updates the levels of preexisting functions.
Protein expression is regulated by the production and degradation of messenger RNAs (mRNAs) and proteins, but their specific relationships remain unknown. We combine measurements of protein production and degradation and mRNA dynamics so as to build a quantitative genomic model of the differential regulation of gene expression in lipopolysaccharide-stimulated mouse dendritic cells. Changes in mRNA abundance play a dominant role in determining most dynamic fold changes in protein levels. Conversely, the preexisting proteome of proteins performing basic cellular functions is remodeled primarily through changes in protein production or degradation, accounting for more than half of the absolute change in protein molecules in the cell. Thus, the proteome is regulated by transcriptional induction for newly activated cellular functions and by protein life-cycle changes for remodeling of preexisting functions.
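The abstract describes inferring per-mRNA translation rates and protein degradation rates at each time point, but does not state the model equations. A common way to write such a relationship, given here only as an illustrative assumption and not as the authors' actual formulation, is a first-order production and decay equation per gene. All symbols below are hypothetical: P_g(t) is the protein abundance of gene g, R_g(t) its mRNA abundance, k^{tl}_g(t) a per-mRNA translation rate, and k^{deg}_g(t) a protein degradation rate.

```latex
% Illustrative sketch only; the symbols and the first-order form are assumptions.
\frac{dP_g(t)}{dt} = k^{\mathrm{tl}}_g(t)\, R_g(t) - k^{\mathrm{deg}}_g(t)\, P_g(t)
```

Under this form, a change in protein level can come from the mRNA term R_g(t) or from the two rate terms, which is the distinction the abstract draws between transcriptional induction and protein life-cycle changes.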
Genome Biology | 2009
Iain MacCallum; Dariusz Przybylski; Sante Gnerre; Joshua N. Burton; Ilya Shlyakhter; Andreas Gnirke; Joel A. Malek; Kevin McKernan; Swati Ranade; Terrance Shea; Louise Williams; Chad Nusbaum; David B. Jaffe
We demonstrate that genome sequences approaching finished quality can be generated from short paired reads. Using 36 base (fragment) and 26 base (jumping) reads from five microbial genomes of varied GC composition and sizes up to 40 Mb, ALLPATHS2 generated assemblies with long, accurate contigs and scaffolds. Velvet and EULER-SR were less accurate. For example, for Escherichia coli, the fraction of 10-kb stretches that were perfect was 99.8% (ALLPATHS2), 68.7% (Velvet), and 42.1% (EULER-SR).
Cell Reports | 2014
Dariusz Przybylski; Jennifer Chen; Aviv Regev
Aging is accompanied by physiological impairments, which, in insulin-responsive tissues, including the liver, predispose individuals to metabolic disease. However, the molecular mechanisms underlying these changes remain largely unknown. Here, we analyze genome-wide profiles of RNA and chromatin organization in the liver of young (3 months) and old (21 months) mice. Transcriptional changes suggest that derepression of the nuclear receptors PPARα, PPARγ, and LXRα in aged mouse liver leads to activation of targets regulating lipid synthesis and storage, whereas age-dependent changes in nucleosome occupancy are associated with binding sites for both known regulators (forkhead factors and nuclear receptors) and candidates associated with the nuclear lamina (Hdac3 and Srf) that are implicated in governing the metabolic function of the aging liver. The winged-helix transcription factor Foxa2 and the nuclear receptor corepressor Hdac3 exhibit a reciprocal binding pattern at PPARα targets, contributing to gene expression changes that lead to steatosis in aged liver.