Jacqueline E. Schein | Researchain

Publication

Featured research published by Jacqueline E. Schein.

Genome Research | 2009

Circos: An information aesthetic for comparative genomics

Martin Krzywinski; Jacqueline E. Schein; Inanc Birol; Joseph M. Connors; Randy D. Gascoyne; Doug Horsman; Steven J.M. Jones; Marco A. Marra

We created a visualization tool called Circos to facilitate the identification and analysis of similarities and differences arising from comparisons of genomes. Our tool is effective in displaying variation in genome structure and, generally, any other kind of positional relationships between genomic intervals. Such data are routinely produced by sequence alignments, hybridization arrays, genome mapping, and genotyping studies. Circos uses a circular ideogram layout to facilitate the display of relationships between pairs of positions by the use of ribbons, which encode the position, size, and orientation of related genomic elements. Circos is capable of displaying data as scatter, line, and histogram plots, heat maps, tiles, connectors, and text. Bitmap or vector images can be created from GFF-style data inputs and hierarchical configuration files, which can be easily generated by automated tools, making Circos suitable for rapid deployment in data analysis and reporting pipelines.

Science | 2006

The Genome of Black Cottonwood, Populus trichocarpa (Torr. & Gray)

Gerald A. Tuskan; Stephen P. DiFazio; Stefan Jansson; Joerg Bohlmann; Igor V. Grigoriev; Uffe Hellsten; Nik Putnam; Steven Ralph; Stephane Rombauts; Asaf Salamov; Jacqueline E. Schein; Lieven Sterck; Andrea Aerts

We report the draft genome of the black cottonwood tree, Populus trichocarpa. Integration of shotgun sequence assembly with genetic mapping enabled chromosome-scale reconstruction of the genome. More than 45,000 putative protein-coding genes were identified. Analysis of the assembled genome revealed a whole-genome duplication event; about 8000 pairs of duplicated genes from that event survived in the Populus genome. A second, older duplication event is indistinguishably coincident with the divergence of the Populus and Arabidopsis lineages. Nucleotide substitution, tandem gene duplication, and gross chromosomal rearrangement appear to proceed substantially more slowly in Populus than in Arabidopsis. Populus has more protein-coding genes than Arabidopsis, ranging on average from 1.4 to 1.6 putative Populus homologs for each Arabidopsis gene. However, the relative frequency of protein domains in the two genomes is similar. Overrepresented exceptions in Populus include genes associated with lignocellulosic wall biosynthesis, meristem development, disease resistance, and metabolite transport.

Genome Research | 2009

ABySS: A parallel assembler for short read sequence data

Jared T. Simpson; Kim Wong; Shaun D. Jackman; Jacqueline E. Schein; Steven J.M. Jones; Inanc Birol

Widespread adoption of massively parallel deoxyribonucleic acid (DNA) sequencing instruments has prompted the recent development of de novo short read assembly algorithms. A common shortcoming of the available tools is their inability to efficiently assemble vast amounts of data generated from large-scale sequencing projects, such as the sequencing of individual human genomes to catalog natural genetic variation. To address this limitation, we developed ABySS (Assembly By Short Sequences), a parallelized sequence assembler. As a demonstration of the capability of our software, we assembled 3.5 billion paired-end reads from the genome of an African male publicly released by Illumina, Inc. Approximately 2.76 million contigs ≥100 base pairs (bp) in length were created with an N50 size of 1499 bp, representing 68% of the reference human genome. Analysis of these contigs identified polymorphic and novel sequences not present in the human reference assembly, which were validated by alignment to alternate human assemblies and to other primate genomes.
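The N50 figure quoted in this abstract is a standard assembly statistic: the contig length L such that contigs of length L or longer together cover at least half of the total assembled bases. The following is a minimal sketch of that common definition, not code from ABySS itself; the function name `n50` and the `min_length` parameter (mirroring the abstract's 100 bp cutoff) are illustrative assumptions.

```python
def n50(lengths, min_length=0):
    """Return the N50 of a collection of contig lengths.

    Hypothetical helper: contigs shorter than min_length are ignored,
    mirroring the way the abstract reports statistics only for
    contigs of at least 100 bp.
    """
    # Keep only contigs that pass the length cutoff, longest first.
    ordered = sorted((n for n in lengths if n >= min_length), reverse=True)
    if not ordered:
        raise ValueError("no contigs at or above min_length")

    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to half of the
        # total assembled length defines the N50.
        if running >= half_total:
            return length


if __name__ == "__main__":
    toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]
    print(n50(toy_contigs))
```

For the toy list above the total length is 32, and the two longest contigs (8 + 8) already reach half of it, so the N50 is 8. Because N50 depends on which contigs are counted, values are only comparable when the same length cutoff is applied.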

Proceedings of the National Academy of Sciences of the United States of America | 2002

Generation and initial analysis of more than 15,000 full-length human and mouse cDNA sequences

Robert L. Strausberg; Elise A. Feingold; Lynette H. Grouse; Jeffery G. Derge; Richard D. Klausner; Francis S. Collins; Lukas Wagner; Carolyn M. Shenmen; Gregory D. Schuler; Stephen F. Altschul; Barry R. Zeeberg; Kenneth H. Buetow; Carl F. Schaefer; Narayan K. Bhat; Ralph F. Hopkins; Heather Jordan; Troy Moore; Steve I. Max; Jun Wang; Florence Hsieh; Luda Diatchenko; Kate Marusina; Andrew A. Farmer; Gerald M. Rubin; Ling Hong; Mark Stapleton; M. Bento Soares; Maria F. Bonaldo; Tom L. Casavant; Todd E. Scheetz

The National Institutes of Health Mammalian Gene Collection (MGC) Program is a multiinstitutional effort to identify and sequence a cDNA clone containing a complete ORF for each human and mouse gene. ESTs were generated from libraries enriched for full-length cDNAs and analyzed to identify candidate full-ORF clones, which then were sequenced to high accuracy. The MGC has currently sequenced and verified the full ORF for a nonredundant set of >9,000 human and >6,000 mouse genes. Candidate full-ORF clones for an additional 7,800 human and 3,500 mouse genes also have been identified. All MGC sequences and clones are available without restriction through public databases and clone distribution networks (see http://mgc.nci.nih.gov).

Nature Genetics | 2010

Somatic mutations altering EZH2 (Tyr641) in follicular and diffuse large B-cell lymphomas of germinal-center origin

Ryan D. Morin; Nathalie A. Johnson; Tesa Severson; Andrew J. Mungall; Jianghong An; Rodrigo Goya; Jessica E. Paul; Merrill Boyle; Bruce Woolcock; Florian Kuchenbauer; Damian Yap; R. Keith Humphries; Obi L. Griffith; Sohrab P. Shah; Henry Zhu; Michelle Kimbara; Pavel Shashkin; Jean F Charlot; Marianna Tcherpakov; Richard Corbett; Angela Tam; Richard Varhol; Duane E. Smailus; Michelle Moksa; Yongjun Zhao; Allen Delaney; Hong Qian; Inanc Birol; Jacqueline E. Schein; Richard A. Moore

Follicular lymphoma (FL) and the GCB subtype of diffuse large B-cell lymphoma (DLBCL) derive from germinal center B cells. Targeted resequencing studies have revealed mutations in various genes encoding proteins in the NF-κB pathway that contribute to the activated B-cell (ABC) DLBCL subtype, but thus far few GCB-specific mutations have been identified. Here we report recurrent somatic mutations affecting the polycomb-group oncogene EZH2, which encodes a histone methyltransferase responsible for trimethylating Lys27 of histone H3 (H3K27). After the recent discovery of mutations in KDM6A (UTX), which encodes the histone H3K27me3 demethylase UTX, in several cancer types, EZH2 is the second histone methyltransferase gene found to be mutated in cancer. These mutations, which result in the replacement of a single tyrosine in the SET domain of the EZH2 protein (Tyr641), occur in 21.7% of GCB DLBCLs and 7.2% of FLs and are absent from ABC DLBCLs. Our data are consistent with the notion that EZH2 proteins with mutant Tyr641 have reduced enzymatic activity in vitro.

Nature | 2011

Frequent mutation of histone-modifying genes in non-Hodgkin lymphoma

Ryan D. Morin; Maria Mendez-Lago; Andrew J. Mungall; Rodrigo Goya; Karen Mungall; Richard Corbett; Nathalie A. Johnson; Tesa Severson; Readman Chiu; Matthew A. Field; Shaun D. Jackman; Martin Krzywinski; David W. Scott; Diane L. Trinh; Jessica Tamura-Wells; Sa Li; Marlo Firme; Sanja Rogic; Malachi Griffith; Susanna Chan; Oleksandr Yakovenko; Irmtraud M. Meyer; Eric Zhao; Duane E. Smailus; Michelle Moksa; Lisa M. Rimsza; Angela Brooks-Wilson; John J. Spinelli; Susana Ben-Neriah; Barbara Meissner

Follicular lymphoma (FL) and diffuse large B-cell lymphoma (DLBCL) are the two most common non-Hodgkin lymphomas (NHLs). Here we sequenced tumour and matched normal DNA from 13 DLBCL cases and one FL case to identify genes with mutations in B-cell NHL. We analysed RNA-seq data from these and another 113 NHLs to identify genes with candidate mutations, and then re-sequenced tumour and matched normal DNA from these cases to confirm 109 genes with multiple somatic mutations. Genes with roles in histone modification were frequent targets of somatic mutation. For example, 32% of DLBCL and 89% of FL cases had somatic mutations in MLL2, which encodes a histone methyltransferase, and 11.4% and 13.4% of DLBCL and FL cases, respectively, had mutations in MEF2B, a calcium-regulated gene that cooperates with CREBBP and EP300 in acetylating histones. Our analysis suggests a previously unappreciated disruption of chromatin biology in lymphomagenesis.

PLOS Biology | 2003

The genome sequence of Caenorhabditis briggsae: A platform for comparative genomics

Lincoln Stein; Zhirong Bao; Darin Blasiar; Thomas Blumenthal; Michael R. Brent; Nansheng Chen; Asif T. Chinwalla; Laura Clarke; Chris Clee; Avril Coghlan; Alan Coulson; Peter D'Eustachio; David H. A. Fitch; Lucinda A. Fulton; Robert Fulton; Sam Griffiths-Jones; Todd W. Harris; LaDeana W. Hillier; Ravi S. Kamath; Patricia E. Kuwabara; Elaine R. Mardis; Marco A. Marra; Tracie L. Miner; Patrick Minx; James C. Mullikin; Robert W. Plumb; Jane Rogers; Jacqueline E. Schein; Marc Sohrmann; John Spieth

The soil nematodes Caenorhabditis briggsae and Caenorhabditis elegans diverged from a common ancestor roughly 100 million years ago and yet are almost indistinguishable by eye. They have the same chromosome number and genome sizes, and they occupy the same ecological niche. To explore the basis for this striking conservation of structure and function, we have sequenced the C. briggsae genome to a high-quality draft stage and compared it to the finished C. elegans sequence. We predict approximately 19,500 protein-coding genes in the C. briggsae genome, roughly the same as in C. elegans. Of these, 12,200 have clear C. elegans orthologs, a further 6,500 have one or more clearly detectable C. elegans homologs, and approximately 800 C. briggsae genes have no detectable matches in C. elegans. Almost all of the noncoding RNAs (ncRNAs) known are shared between the two species. The two genomes exhibit extensive colinearity, and the rate of divergence appears to be higher in the chromosomal arms than in the centers. Operons, a distinctive feature of C. elegans, are highly conserved in C. briggsae, with the arrangement of genes being preserved in 96% of cases. The difference in size between the C. briggsae (estimated at approximately 104 Mbp) and C. elegans (100.3 Mbp) genomes is almost entirely due to repetitive sequence, which accounts for 22.4% of the C. briggsae genome in contrast to 16.5% of the C. elegans genome. Few, if any, repeat families are shared, suggesting that most were acquired after the two species diverged or are undergoing rapid evolution. Coclustering the C. elegans and C. briggsae proteins reveals 2,169 protein families of two or more members. Most of these are shared between the two species, but some appear to be expanding or contracting, and there seem to be as many as several hundred novel C. briggsae gene families. The C. briggsae draft sequence will greatly improve the annotation of the C. elegans genome. Based on similarity to C. briggsae, we found strong evidence for 1,300 new C. elegans genes. In addition, comparisons of the two genomes will help to understand the evolutionary forces that mold nematode genomes.

Nature Methods | 2010

De novo assembly and analysis of RNA-seq data

Gordon Robertson; Jacqueline E. Schein; Readman Chiu; Richard Corbett; Matthew A. Field; Shaun D. Jackman; Karen Mungall; Sam Lee; Hisanaga Mark Okada; Jenny Q. Qian; Malachi Griffith; Anthony Raymond; Nina Thiessen; Timothee Cezard; Yaron S N Butterfield; Richard Newsome; Simon K. Chan; Rong She; Richard Varhol; Baljit Kamoh; Anna-Liisa Prabhu; Angela Tam; Yongjun Zhao; Richard A. Moore; Martin Hirst; Marco A. Marra; Steven J.M. Jones; Pamela A. Hoodless; Inanc Birol

We describe Trans-ABySS, a de novo short-read transcriptome assembly and analysis pipeline that addresses variation in local read densities by assembling read substrings with varying stringencies and then merging the resulting contigs before analysis. Analyzing 7.4 gigabases of 50-base-pair paired-end Illumina reads from an adult mouse liver poly(A) RNA library, we identified known, new and alternative structures in expressed transcripts, and achieved high sensitivity and specificity relative to reference-based assembly methods.

Proceedings of the National Academy of Sciences of the United States of America | 2006

The complete genome of Rhodococcus sp. RHA1 provides insights into a catabolic powerhouse

Michael P. McLeod; René L. Warren; William W. L. Hsiao; Naoto Araki; Matthew Myhre; Clinton Fernandes; Daisuke Miyazawa; Wendy Wong; Anita L. Lillquist; Dennis Wang; Manisha Dosanjh; Hirofumi Hara; Anca Petrescu; Ryan D. Morin; George P. Yang; Jeff M. Stott; Jacqueline E. Schein; Heesun Shin; Duane E. Smailus; Asim Siddiqui; Marco A. Marra; Steven J.M. Jones; Robert A. Holt; Fiona S. L. Brinkman; Keisuke Miyauchi; Masao Fukuda; Julian Davies; William W. Mohn; Lindsay D. Eltis

Rhodococcus sp. RHA1 (RHA1) is a potent polychlorinated biphenyl-degrading soil actinomycete that catabolizes a wide range of compounds and represents a genus of considerable industrial interest. RHA1 has one of the largest bacterial genomes sequenced to date, comprising 9,702,737 bp (67% G+C) arranged in a linear chromosome and three linear plasmids. A targeted insertion methodology was developed to determine the telomeric sequences. RHA1's 9,145 predicted protein-encoding genes are exceptionally rich in oxygenases (203) and ligases (192). Many of the oxygenases occur in the numerous pathways predicted to degrade aromatic compounds (30) or steroids (4). RHA1 also contains 24 nonribosomal peptide synthase genes, six of which exceed 25 kbp, and seven polyketide synthase genes, providing evidence that rhodococci harbor an extensive secondary metabolism. Among sequenced genomes, RHA1 is most similar to those of nocardial and mycobacterial strains. The genome contains few recent gene duplications. Moreover, three different analyses indicate that RHA1 has acquired fewer genes by recent horizontal transfer than most bacteria characterized to date and far fewer than Burkholderia xenovorans LB400, whose genome size and catabolic versatility rival those of RHA1. RHA1 and LB400 thus appear to demonstrate that ecologically similar bacteria can evolve large genomes by different means. Overall, RHA1 appears to have evolved to simultaneously catabolize a diverse range of plant-derived compounds in an O2-rich environment. In addition to establishing RHA1 as an important model for studying actinomycete physiology, this study provides critical insights that facilitate the exploitation of these industrially important microorganisms.

Proceedings of the National Academy of Sciences of the United States of America | 2011

Obligate biotrophy features unraveled by the genomic analysis of rust fungi

Sébastien Duplessis; Christina A. Cuomo; Yao-Cheng Lin; Andrea Aerts; Emilie Tisserant; Claire Veneault-Fourrey; David L. Joly; Stéphane Hacquard; Joelle Amselem; Brandi L. Cantarel; Readman Chiu; Pedro M. Coutinho; Nicolas Feau; Matthew A. Field; Pascal Frey; Eric Gelhaye; Jonathan M. Goldberg; Manfred Grabherr; Chinnappa D. Kodira; Annegret Kohler; Ursula Kües; Erika Lindquist; Susan Lucas; Rohit Mago; Evan Mauceli; Emmanuelle Morin; Claude Murat; Jasmyn Pangilinan; Robert F. Park; Matthew Pearson

Rust fungi are some of the most devastating pathogens of crop plants. They are obligate biotrophs, which extract nutrients only from living plant tissues and cannot grow apart from their hosts. Their lifestyle has slowed the dissection of molecular mechanisms underlying host invasion and avoidance or suppression of plant innate immunity. We sequenced the 101-Mb genome of Melampsora larici-populina, the causal agent of poplar leaf rust, and the 89-Mb genome of Puccinia graminis f. sp. tritici, the causal agent of wheat and barley stem rust. We then compared the 16,399 predicted proteins of M. larici-populina with the 17,773 predicted proteins of P. graminis f. sp. tritici. Genomic features related to their obligate biotrophic lifestyle include expanded lineage-specific gene families, a large repertoire of effector-like small secreted proteins, impaired nitrogen and sulfur assimilation pathways, and expanded families of amino acid and oligopeptide membrane transporters. The dramatic up-regulation of transcripts coding for small secreted proteins, secreted hydrolytic enzymes, and transporters in planta suggests that they play a role in host infection and nutrient acquisition. Some of these genomic hallmarks are mirrored in the genomes of other microbial eukaryotes that have independently evolved to infect plants, indicating convergent adaptation to a biotrophic existence inside plant cells.
