Publication


Featured research published by Thomas D. Wu.


Bioinformatics | 2010

Fast and SNP-tolerant detection of complex variants and splicing in short reads

Thomas D. Wu; Serban Nacu

Motivation: Next-generation sequencing captures sequence differences in reads relative to a reference genome or transcriptome, including splicing events and complex variants involving multiple mismatches and long indels. We present computational methods for fast detection of complex variants and splicing in short reads, based on a successively constrained search process of merging and filtering position lists from a genomic index. Our methods are implemented in GSNAP (Genomic Short-read Nucleotide Alignment Program), which can align both single- and paired-end reads as short as 14 nt and of arbitrarily long length. It can detect short- and long-distance splicing, including interchromosomal splicing, in individual reads, using probabilistic models or a database of known splice sites. Our program also permits SNP-tolerant alignment to a reference space of all possible combinations of major and minor alleles, and can align reads from bisulfite-treated DNA for the study of methylation state. Results: In comparison testing, GSNAP has speeds comparable to existing programs, especially in reads of ≥70 nt and is fastest in detecting complex variants with four or more mismatches or insertions of 1–9 nt and deletions of 1–30 nt. Although SNP tolerance does not increase alignment yield substantially, it affects alignment results in 7–8% of transcriptional reads, typically by revealing alternate genomic mappings for a read. Simulations of bisulfite-converted DNA show a decrease in identifying genomic positions uniquely in 6% of 36 nt reads and 3% of 70 nt reads. Availability: Source code in C and utility programs in Perl are freely available for download as part of the GMAP package at http://share.gene.com/gmap.


Bioinformatics | 2005

GMAP: a genomic mapping and alignment program for mRNA and EST sequences

Thomas D. Wu; Colin K. Watanabe

Motivation: We introduce GMAP, a standalone program for mapping and aligning cDNA sequences to a genome. The program maps and aligns a single sequence with minimal startup time and memory requirements, and provides fast batch processing of large sequence sets. The program generates accurate gene structures, even in the presence of substantial polymorphisms and sequence errors, without using probabilistic splice site models. Methodology underlying the program includes a minimal sampling strategy for genomic mapping, oligomer chaining for approximate alignment, sandwich DP for splice site detection, and microexon identification with statistical significance testing. Results: On a set of human messenger RNAs with random mutations at a 1 and 3% rate, GMAP identified all splice sites accurately in over 99.3% of the sequences, which was one-tenth the error rate of existing programs. On a large set of human expressed sequence tags, GMAP provided higher-quality alignments more often than blat did. On a set of Arabidopsis cDNAs, GMAP performed comparably with GeneSeqer. In these experiments, GMAP demonstrated a several-fold increase in speed over existing programs. Availability: Source code for gmap and associated programs is available at http://www.gene.com/share/gmap. Supplementary information: http://www.gene.com/share/gmap.


Nature | 2012

Recurrent R-spondin fusions in colon cancer

Somasekar Seshagiri; Eric Stawiski; Steffen Durinck; Zora Modrusan; Elaine E. Storm; Caitlin B. Conboy; Subhra Chaudhuri; Yinghui Guan; Vasantharajan Janakiraman; Bijay S. Jaiswal; Joseph Guillory; Connie Ha; Gerrit J. P. Dijkgraaf; Jeremy Stinson; Florian Gnad; Melanie A. Huntley; Jeremiah D. Degenhardt; Peter M. Haverty; Richard Bourgon; Weiru Wang; Hartmut Koeppen; Robert Gentleman; Timothy K. Starr; Zemin Zhang; David A. Largaespada; Thomas D. Wu; Frederic J. de Sauvage

Identifying and understanding changes in cancer genomes is essential for the development of targeted therapeutics. Here we analyse systematically more than 70 pairs of primary human colon tumours by applying next-generation sequencing to characterize their exomes, transcriptomes and copy-number alterations. We have identified 36,303 protein-altering somatic changes that include several new recurrent mutations in the Wnt pathway gene TCF7L2, chromatin-remodelling genes such as TET2 and TET3 and receptor tyrosine kinases including ERBB3. Our analysis for significantly mutated cancer genes identified 23 candidates, including the cell cycle checkpoint kinase ATM. Copy-number and RNA-seq data analysis identified amplifications and corresponding overexpression of IGF2 in a subset of colon tumours. Furthermore, using RNA-seq data we identified multiple fusion transcripts including recurrent gene fusions involving R-spondin family members RSPO2 and RSPO3 that together occur in 10% of colon tumours. The RSPO fusions were mutually exclusive with APC mutations, indicating that they probably have a role in the activation of Wnt signalling and tumorigenesis. Consistent with this we show that the RSPO fusion proteins were capable of potentiating Wnt signalling. The R-spondin gene fusions and several other gene mutations identified in this study provide new potential opportunities for therapeutic intervention in colon cancer.


Nature Genetics | 2012

Comprehensive genomic analysis identifies SOX2 as a frequently amplified gene in small-cell lung cancer

Charles M. Rudin; Steffen Durinck; Eric Stawiski; John T. Poirier; Zora Modrusan; David S. Shames; Emily Bergbower; Yinghui Guan; James Shin; Joseph Guillory; Celina Sanchez Rivers; Catherine K. Foo; Deepali Bhatt; Jeremy Stinson; Florian Gnad; Peter M. Haverty; Robert Gentleman; Subhra Chaudhuri; Vasantharajan Janakiraman; Bijay S. Jaiswal; Chaitali Parikh; Wenlin Yuan; Zemin Zhang; Hartmut Koeppen; Thomas D. Wu; Howard M. Stern; Robert L. Yauch; Kenneth Huffman; Diego D Paskulin; Peter B. Illei

Small-cell lung cancer (SCLC) is an exceptionally aggressive disease with poor prognosis. Here, we obtained exome, transcriptome and copy-number alteration data from approximately 53 samples consisting of 36 primary human SCLC and normal tissue pairs and 17 matched SCLC and lymphoblastoid cell lines. We also obtained data for 4 primary tumors and 23 SCLC cell lines. We identified 22 significantly mutated genes in SCLC, including genes encoding kinases, G protein–coupled receptors and chromatin-modifying proteins. We found that several members of the SOX family of genes were mutated in SCLC. We also found SOX2 amplification in ∼27% of the samples. Suppression of SOX2 using shRNAs blocked proliferation of SOX2-amplified SCLC lines. RNA sequencing identified multiple fusion transcripts and a recurrent RLF-MYCL1 fusion. Silencing of MYCL1 in SCLC cell lines that had the RLF-MYCL1 fusion decreased cell proliferation. These data provide an in-depth view of the spectrum of genomic alterations in SCLC and identify several potential targets for therapeutic intervention.


Nature | 2010

Genome, epigenome and RNA sequences of monozygotic twins discordant for multiple sclerosis

Sergio E. Baranzini; Joann Mudge; Jennifer C. van Velkinburgh; Pouya Khankhanian; Irina Khrebtukova; Neil Miller; Lu Zhang; Andrew D. Farmer; Callum J. Bell; Ryan W. Kim; Gregory D. May; Jimmy E. Woodward; Stacy J. Caillier; Joseph P. McElroy; Refujia Gomez; Marcelo J. Pando; Leonda E. Clendenen; Elena E. Ganusova; Faye D. Schilkey; Thiruvarangan Ramaraj; Omar Khan; Jim J. Huntley; Shujun Luo; Pui-Yan Kwok; Thomas D. Wu; Gary P. Schroth; Jorge R. Oksenberg; Stephen L. Hauser; Stephen F. Kingsmore

Monozygotic or ‘identical’ twins have been widely studied to dissect the relative contributions of genetics and environment in human diseases. In multiple sclerosis (MS), an autoimmune demyelinating disease and common cause of neurodegeneration and disability in young adults, disease discordance in monozygotic twins has been interpreted to indicate environmental importance in its pathogenesis. However, genetic and epigenetic differences between monozygotic twins have been described, challenging the accepted experimental model in disambiguating the effects of nature and nurture. Here we report the genome sequences of one MS-discordant monozygotic twin pair, and messenger RNA transcriptome and epigenome sequences of CD4+ lymphocytes from three MS-discordant, monozygotic twin pairs. No reproducible differences were detected between co-twins among ∼3.6 million single nucleotide polymorphisms (SNPs) or ∼0.2 million insertion-deletion polymorphisms. Nor were any reproducible differences observed between siblings of the three twin pairs in HLA haplotypes, confirmed MS-susceptibility SNPs, copy number variations, mRNA and genomic SNP and insertion-deletion genotypes, or the expression of ∼19,000 genes in CD4+ T cells. Only 2 to 176 differences in the methylation of ∼2 million CpG dinucleotides were detected between siblings of the three twin pairs, in contrast to ∼800 methylation differences between T cells of unrelated individuals and several thousand differences between tissues or between normal and cancerous tissues. In the first systematic effort to estimate sequence variation among monozygotic co-twins, we did not find evidence for genetic, epigenetic or transcriptome differences that explained disease discordance. These are the first, to our knowledge, female, twin and autoimmune disease individual genome sequences reported.


Journal of Virology | 2003

Mutation Patterns and Structural Correlates in Human Immunodeficiency Virus Type 1 Protease following Different Protease Inhibitor Treatments

Thomas D. Wu; Celia A. Schiffer; Jonathan Taylor; Rami Kantor; Sunwen Chou; Dennis Israelski; Andrew R. Zolopa; W. Jeffrey Fessel; Robert W. Shafer

Although many human immunodeficiency virus type 1 (HIV-1)-infected persons are treated with multiple protease inhibitors in combination or in succession, mutation patterns of protease isolates from these persons have not been characterized. We collected and analyzed 2,244 subtype B HIV-1 isolates from 1,919 persons with different protease inhibitor experiences: 1,004 isolates from untreated persons, 637 isolates from persons who received one protease inhibitor, and 603 isolates from persons receiving two or more protease inhibitors. The median number of protease mutations per isolate increased from 4 in untreated persons to 12 in persons who had received four or more protease inhibitors. Mutations at 45 of the 99 amino acid positions in the protease, including 22 not previously associated with drug resistance, were significantly associated with protease inhibitor treatment. Mutations at 17 of the remaining 54 positions were polymorphic but not associated with drug treatment. Pairs and clusters of correlated (covarying) mutations were significantly more likely to occur in treated than in untreated persons: 115 versus 23 pairs and 30 versus 2 clusters, respectively. Of the 115 statistically significant pairs of covarying residues in the treated isolates, 59 were within 8 Å of each other, many more than would be expected by chance. In summary, nearly one-half of HIV-1 protease positions are under selective drug pressure, including many residues not previously associated with drug resistance. Structural factors appear to be responsible for the high frequency of covariation among many of the protease residues. The presence of mutational clusters provides insight into the complex mutational patterns required for HIV-1 protease inhibitor resistance.


Nature Biotechnology | 2015

A comprehensive transcriptional portrait of human cancer cell lines

Christiaan Klijn; Steffen Durinck; Eric Stawiski; Peter M. Haverty; Zhaoshi Jiang; Hanbin Liu; Jeremiah D. Degenhardt; Oleg Mayba; Florian Gnad; Jinfeng Liu; Gregoire Pau; Jens Reeder; Yi Cao; Kiran Mukhyala; Suresh Selvaraj; Mamie Yu; Gregory J Zynda; Matthew J. Brauer; Thomas D. Wu; Robert Gentleman; Gerard Manning; Robert L. Yauch; Richard Bourgon; David Stokoe; Zora Modrusan; Richard M. Neve; Frederic J. de Sauvage; Jeffrey Settleman; Somasekar Seshagiri; Zemin Zhang

Tumor-derived cell lines have served as vital models to advance our understanding of oncogene function and therapeutic responses. Although substantial effort has been made to define the genomic constitution of cancer cell line panels, the transcriptome remains understudied. Here we describe RNA sequencing and single-nucleotide polymorphism (SNP) array analysis of 675 human cancer cell lines. We report comprehensive analyses of transcriptome features including gene expression, mutations, gene fusions and expression of non-human sequences. Of the 2,200 gene fusions catalogued, 1,435 consist of genes not previously found in fusions, providing many leads for further investigation. We combine multiple genome and transcriptome features in a pathway-based approach to enhance prediction of response to targeted therapeutics. Our results provide a valuable resource for studies that use cancer cell lines.


The Journal of Pathology | 2001

Analysing gene expression data from DNA microarrays to identify candidate genes

Thomas D. Wu

Microarray data analysis can be divided into two tasks: grouping of genes to discover broad patterns of biological behaviour, and filtering of genes to identify specific genes of interest. Whereas the gene‐grouping task is largely addressed by cluster analysis, the gene‐filtering task relies primarily on hypothesis testing. This review article surveys analytical methods for the gene‐filtering task. Various types of data analysis are discussed for four basic types of experimental protocols: a comparison of two biological samples; a comparison of two biological conditions, each represented by a set of replicate samples; a comparison of multiple biological conditions; and analysis of covariate information.


Molecular Cancer Research | 2009

Genetic Alterations and Oncogenic Pathways Associated with Breast Cancer Subtypes

Xiaolan Hu; Howard M. Stern; Lin Ge; Carol O'Brien; Lauren Haydu; Cynthia Honchell; Peter M. Haverty; Brock A. Peters; Thomas D. Wu; Lukas C. Amler; John Chant; David Stokoe; Mark R. Lackner; Guy Cavet

Breast cancers can be divided into subtypes with important implications for prognosis and treatment. We set out to characterize the genetic alterations observed in different breast cancer subtypes and to identify specific candidate genes and pathways associated with subtype biology. mRNA expression levels of estrogen receptor, progesterone receptor, and HER2 were shown to predict marker status determined by immunohistochemistry and to be effective at assigning samples to subtypes. HER2+ cancers were shown to have the greatest frequency of high-level amplification (independent of the ERBB2 amplicon itself), but triple-negative cancers had the highest overall frequencies of copy gain. Triple-negative cancers also were shown to have more frequent loss of phosphatase and tensin homologue and mutation of RB1, which may contribute to genomic instability. We identified and validated seven regions of copy number alteration associated with different subtypes, and used integrative bioinformatics analysis to identify candidate oncogenes and tumor suppressors, including ERBB2, GRB7, MYST2, PPM1D, CCND1, HDAC2, FOXA1, and RASA1. We tested the candidate oncogene MYST2 and showed that it enhances the anchorage-independent growth of breast cancer cells. The genome-wide and region-specific differences between subtypes suggest the differential activation of oncogenic pathways. (Mol Cancer Res 2009;7(4):511–22)


Nature Genetics | 2016

Comprehensive genomic analysis of malignant pleural mesothelioma identifies recurrent mutations, gene fusions and splicing alterations

Raphael Bueno; Eric Stawiski; Leonard D. Goldstein; Steffen Durinck; Assunta De Rienzo; Zora Modrusan; Florian Gnad; Thong T. Nguyen; Bijay S. Jaiswal; Lucian R. Chirieac; Daniele Sciaranghella; Nhien Dao; Corinne E. Gustafson; Kiara J. Munir; Jason A. Hackney; Amitabha Chaudhuri; Ravi Gupta; Joseph Guillory; Karen Toy; Connie Ha; Ying-Jiun Chen; Jeremy Stinson; Subhra Chaudhuri; Na Zhang; Thomas D. Wu; David J. Sugarbaker; Frederic J. de Sauvage; William G. Richards; Somasekar Seshagiri

We analyzed transcriptomes (n = 211), whole exomes (n = 99) and targeted exomes (n = 103) from 216 malignant pleural mesothelioma (MPM) tumors. Using RNA-seq data, we identified four distinct molecular subtypes: sarcomatoid, epithelioid, biphasic-epithelioid (biphasic-E) and biphasic-sarcomatoid (biphasic-S). Through exome analysis, we found BAP1, NF2, TP53, SETD2, DDX3X, ULK2, RYR2, CFAP45, SETDB1 and DDX51 to be significantly mutated (q-score ≥ 0.8) in MPMs. We identified recurrent mutations in several genes, including SF3B1 (∼2%; 4/216) and TRAF7 (∼2%; 5/216). SF3B1-mutant samples showed a splicing profile distinct from that of wild-type tumors. TRAF7 alterations occurred primarily in the WD40 domain and were, except in one case, mutually exclusive with NF2 alterations. We found recurrent gene fusions and splice alterations to be frequent mechanisms for inactivation of NF2, BAP1 and SETD2. Through integrated analyses, we identified alterations in Hippo, mTOR, histone methylation, RNA helicase and p53 signaling pathways in MPMs.
