Jan Schröder
University of Melbourne
Publications
Featured research published by Jan Schröder.
Bioinformatics | 2009
Jan Schröder; Heiko Schröder; Simon J. Puglisi; Ranjan Sinha; Bertil Schmidt
MOTIVATION Second-generation sequencing technologies produce a massive amount of short reads in a single experiment. However, sequencing errors can cause major problems when using this approach for de novo sequencing applications. Moreover, existing error correction methods have been designed and optimized for shotgun sequencing. Therefore, there is an urgent need for the design of fast and accurate computational methods and tools for error correction of large amounts of short read data. RESULTS We present SHREC, a new algorithm for correcting errors in short-read data that uses a generalized suffix trie on the read data as the underlying data structure. Our results show that the method can identify erroneous reads with sensitivity and specificity of over 99% and 96% for simulated data with error rates of up to 3% as well as for real data. Furthermore, it achieves an error correction accuracy of over 80% for simulated data and over 88% for real data. These results are clearly superior to previously published approaches. SHREC is available as an efficient open-source Java implementation that allows processing of 10 million short reads on a standard workstation.
Bioinformatics | 2013
Yongchao Liu; Jan Schröder; Bertil Schmidt
MOTIVATION The imperfect sequence data produced by next-generation sequencing technologies have motivated the development of a number of short-read error correctors in recent years. The majority of methods focus on the correction of substitution errors, which are the dominant error source in data produced by Illumina sequencing technology. Existing tools either score high in terms of recall or precision but not consistently high in terms of both measures. RESULTS In this article, we present Musket, an efficient multistage k-mer-based corrector for Illumina short-read data. We use the k-mer spectrum approach and introduce three correction techniques in a multistage workflow: two-sided conservative correction, one-sided aggressive correction and voting-based refinement. Our performance evaluation results, in terms of correction quality and de novo genome assembly measures, reveal that Musket is consistently one of the top performing correctors. In addition, Musket is multi-threaded using a master-slave model and demonstrates superior parallel scalability compared with all other evaluated correctors as well as a highly competitive overall execution time. AVAILABILITY Musket is available at http://musket.sourceforge.net.
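The k-mer spectrum idea mentioned in the abstract above can be illustrated with a minimal sketch. This is not Musket's implementation: the function names, the toy reads, and the `min_count` threshold are assumptions chosen for illustration. The sketch counts every k-mer across the reads and flags positions in a read whose k-mer occurs too rarely to be trusted.

```python
from collections import Counter


def kmer_spectrum(reads, k):
    """Count every length-k substring across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def untrusted_positions(read, counts, k, min_count):
    """Start positions of k-mers in `read` seen fewer than `min_count` times."""
    return [
        i
        for i in range(len(read) - k + 1)
        if counts[read[i:i + k]] < min_count
    ]


# Toy data (hypothetical): the last read differs from the others at index 4.
reads = ["ACGTACGT", "ACGTACGT", "ACGTACGT", "ACGTTCGT"]
spectrum = kmer_spectrum(reads, k=4)
flagged = untrusted_positions(reads[3], spectrum, k=4, min_count=2)
print(flagged)
```

In this toy case every flagged k-mer overlaps the substituted base, which is the signal a spectrum-based corrector would use to propose a replacement; the correction stages themselves are not shown here.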
Cancer Research | 2015
Stephen Q. Wong; Kelly Waldeck; Ismael A. Vergara; Jan Schröder; Jason Madore; James S. Wilmott; Andrew J. Colebatch; R. De Paoli-Iseppi; Jason Li; Richard Lupat; Timothy Semple; Gisela Mir Arnau; Andrew Fellowes; J.H. Leonard; George Hruby; Graham J. Mann; John F. Thompson; Carleen Cullinane; Meredith L. Johnston; Mark Shackleton; Shahneen Sandhu; David Bowtell; Ricky W. Johnstone; Stephen B. Fox; Grant A. McArthur; Anthony T. Papenfuss; Richard A. Scolyer; Anthony J. Gill; Rodney J. Hicks; Richard W. Tothill
Merkel cell carcinoma (MCC) is an uncommon, but highly malignant, cutaneous tumor. Merkel cell polyoma virus (MCV) has been implicated in a majority of MCC tumors; however, viral-negative tumors have been reported to be more prevalent in some geographic regions subject to high sun exposure. While the impact of MCV and viral T-antigens on MCC development has been extensively investigated, little is known about the etiology of viral-negative tumors. We performed targeted capture and massively parallel DNA sequencing of 619 cancer genes to compare the gene mutations and copy number alterations in MCV-positive (n = 13) and -negative (n = 21) MCC tumors and cell lines. We found that MCV-positive tumors displayed very low mutation rates, but MCV-negative tumors exhibited a high mutation burden associated with a UV-induced DNA damage signature. All viral-negative tumors harbored mutations in RB1, TP53, and a high frequency of mutations in NOTCH1 and FAT1. Additional mutated or amplified cancer genes of potential clinical importance included PI3K (PIK3CA, AKT1, PIK3CG) and MAPK (HRAS, NF1) pathway members and the receptor tyrosine kinase FGFR2. Furthermore, looking ahead to potential therapeutic strategies encompassing immune checkpoint inhibitors such as anti-PD-L1, we also assessed the status of T-cell-infiltrating lymphocytes (TIL) and PD-L1 in MCC tumors. A subset of viral-negative tumors exhibited high TILs and PD-L1 expression, corresponding with the higher mutation load within these cancers. Taken together, this study provides new insights into the underlying biology of viral-negative MCC and paves the road for further investigation into new treatment opportunities.
Cancer Cell | 2014
Dale W. Garsed; Owen J. Marshall; Vincent Corbin; Arthur L. Hsu; Leon Di Stefano; Jan Schröder; Jason Li; Zhi-Ping Feng; Bo W. Kim; Mark Kowarsky; Ben Lansdell; Ross Brookwell; Ola Myklebost; Leonardo A. Meza-Zepeda; Andrew J. Holloway; Florence Pedeutour; K.H. Andy Choo; Michael A. Damore; Andrew J. Deans; Anthony T. Papenfuss; David Thomas
We isolated and analyzed, at single-nucleotide resolution, cancer-associated neochromosomes from well- and/or dedifferentiated liposarcomas. Neochromosomes, which can exceed 600 Mb in size, initially arise as circular structures following chromothripsis involving chromosome 12. The core of the neochromosome is amplified, rearranged, and corroded through hundreds of breakage-fusion-bridge cycles. Under selective pressure, amplified oncogenes are overexpressed, while coamplified passenger genes may be silenced epigenetically. New material may be captured during punctuated chromothriptic events. Centromeric corrosion leads to crisis, which is resolved through neocentromere formation or native centromere capture. Finally, amplification terminates, and the neochromosome core is stabilized in linear form by telomere capture. This study investigates the dynamic mutational processes underlying the life history of a special form of cancer mutation.
PLOS ONE | 2010
Jan Schröder; James Bailey; Thomas C. Conway; Justin Zobel
Background: High-throughput DNA sequencing techniques offer the ability to rapidly and cheaply sequence material such as whole genomes. However, the short-read data produced by these techniques can be biased or compromised at several stages in the sequencing process; the sources and properties of some of these biases are not always known. Accurate assessment of bias is required for experimental quality control, genome assembly, and interpretation of coverage results. An additional challenge is that, for new genomes or material from an unidentified source, there may be no reference available against which the reads can be checked. Results: We propose analytical methods for identifying biases in a collection of short reads, without recourse to a reference. These, in conjunction with existing approaches, comprise a methodology that can be used to quantify the quality of a set of reads. Our methods involve use of three different measures: analysis of base calls; analysis of k-mers; and analysis of distributions of k-mers. We apply our methodology to a wide range of short read data and show that, surprisingly, strong biases appear to be present. These include gross overrepresentation of some poly-base sequences, per-position biases towards some bases, and apparent preferences for some starting positions over others. Conclusions: The existence of biases in short read data is known, but they appear to be greater and more diverse than identified in previous literature. Statistical analysis of a set of short reads can help identify issues prior to assembly or resequencing, and should help guide chemical or statistical methods for bias rectification.
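One of the measures named in the abstract above, analysis of base calls, can be sketched without any reference genome. The following is an illustrative assumption about how such a check might look, not the authors' method: it tabulates, for each read position, the fraction of each base observed there, so that a position dominated by one base stands out. The function name and the toy reads are hypothetical.

```python
from collections import Counter


def per_position_base_fractions(reads):
    """For each read position, return the fraction of each base seen there."""
    longest = max(len(read) for read in reads)
    columns = [Counter() for _ in range(longest)]
    for read in reads:
        for pos, base in enumerate(read):
            columns[pos][base] += 1
    fractions = []
    for column in columns:
        total = sum(column.values())
        fractions.append({base: n / total for base, n in column.items()})
    return fractions


# Toy data (hypothetical): every read starts with A, a per-position bias.
reads = ["ACGT", "ACGA", "ACTT", "AGGT"]
profile = per_position_base_fractions(reads)
for pos, column in enumerate(profile):
    print(pos, column)
```

On real data the per-position fractions would be compared against the overall base composition of the read set; that comparison and any significance test are left out of this sketch.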
Plant Molecular Biology | 2001
Jan Schröder; Heiko Stenger; Wolfgang Wernicke
Intricate changes in the patterns of the cytoskeleton, especially of microtubules, appear to control the establishment of complex plant cell shapes. Little is known about how these changes are accomplished. The objective of the present study was to test whether or not α-tubulin genes are differentially expressed during cell shaping in growing leaves of barley (Hordeum vulgare L.). Five α-tubulin genes representing at least most members of the gene family were found to be expressed in the leaf. Dot-blot analyses revealed expression patterns that could be classified into three groups. Two isotypes (HVATUB2 and HVATUB4) were maximally expressed in the meristem with a steady decline during the differentiation process (1). One isotype (HVATUB3) appeared to be constitutively expressed during cell shaping, although strongest signals were found during late stages, before the general decline in microtubular activity (2). The most striking finding was that two types (HVATUB1 and HVATUB5) were almost exclusively expressed in early post-mitotic cells, when transverse microtubular bundles determining the future cell shape in the mesophyll are formed (3). Relative transcript abundance was highest in HVATUB2 and HVATUB3, whereas the transcript level of the only transiently expressed HVATUB5 was very low, even during its phase of maximum expression. The results are discussed in the context of the general debate relating to the significance of multiple tubulin isotypes.
BMC Bioinformatics | 2014
Adrianto Wirawan; Robert S. Harris; Yongchao Liu; Bertil Schmidt; Jan Schröder
Background: Current-generation sequencing technologies are able to produce low-cost, high-throughput reads. However, the produced reads are imperfect and may contain various sequencing errors. Although many error correction methods have been developed in recent years, none explicitly targets homopolymer-length errors in the 454 sequencing reads. Results: We present HECTOR, a parallel multistage homopolymer spectrum based error corrector for 454 sequencing data. In this algorithm, for the first time we have investigated a novel homopolymer spectrum based approach to handle homopolymer insertions or deletions, which are the dominant sequencing errors in 454 pyrosequencing reads. We have evaluated the performance of HECTOR, in terms of correction quality, runtime and parallel scalability, using both simulated and real pyrosequencing datasets. This performance has been further compared to that of Coral, a state-of-the-art error corrector which is based on multiple sequence alignment and Acacia, a recently published error corrector for amplicon pyrosequences. Our evaluations reveal that HECTOR demonstrates comparable correction quality to Coral, but runs 3.7× faster on average. In addition, HECTOR performs well even when the coverage of the dataset is low. Conclusion: Our homopolymer spectrum based approach is theoretically capable of processing arbitrary-length homopolymer-length errors, with a linear time complexity. HECTOR employs a multi-threaded design based on a master-slave computing model. Our experimental results show that HECTOR is a practical 454 pyrosequencing read error corrector which is competitive in terms of both correction quality and speed. The source code and all simulated data are available at: http://hector454.sourceforge.net.
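To make the notion of a homopolymer-length error concrete, here is a minimal sketch that splits a read into runs of identical bases. It is not HECTOR's implementation; the function names and the two example reads are assumptions for illustration. Two reads that differ only in the length of one run collapse to the same base sequence, which is why working on runs rather than single bases suits 454 data.

```python
from itertools import groupby


def homopolymer_runs(read):
    """Split a read into (base, run_length) pairs in linear time."""
    return [(base, len(list(group))) for base, group in groupby(read)]


def collapse(read):
    """Reduce every homopolymer run to a single base."""
    return "".join(base for base, _ in homopolymer_runs(read))


# Hypothetical reads: `observed` has one C fewer than `expected`.
expected = "ACCCGTT"
observed = "ACCGTT"
print(homopolymer_runs(expected))
print(homopolymer_runs(observed))
print(collapse(expected) == collapse(observed))
```

Comparing the run-length lists position by position then localizes the discrepancy to the C run (length 3 versus 2) without needing an alignment that models insertions and deletions of individual bases.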
Journal of Biological Chemistry | 2017
James M. McCoy; Rebecca J. Stewart; Alessandro D. Uboldi; Dongdi Li; Jan Schröder; Nichollas E. Scott; Anthony T. Papenfuss; Adele M. Lehane; Leonard J. Foster; Christopher J. Tonkin
Toxoplasma gondii, like all apicomplexan parasites, uses Ca2+ signaling pathways to activate gliding motility to power tissue dissemination and host cell invasion and egress. A group of “plant-like” Ca2+-dependent protein kinases (CDPKs) transduces cytosolic Ca2+ flux into enzymatic activity, but how they function is poorly understood. To investigate how Ca2+ signaling activates egress through CDPKs, we performed a forward genetic screen to isolate gain-of-function mutants from an egress-deficient cdpk3 knockout strain. We recovered mutants that regained the ability to egress from host cells that harbored mutations in the gene Suppressor of Ca2+-dependent Egress 1 (SCE1). Global phosphoproteomic analysis showed that SCE1 deletion restored many Δcdpk3-dependent phosphorylation events to near wild-type levels. We also show that CDPK3-dependent SCE1 phosphorylation is required to relieve its suppressive activity to potentiate egress. In summary, our work has uncovered a novel component and suppressor of Ca2+-dependent cell egress during Toxoplasma lytic growth.
PLOS ONE | 2015
Jan Schröder; Santhosh Girirajan; Anthony T. Papenfuss; Paul Medvedev
The uses of the Genome Reference Consortium’s human reference sequence can be roughly categorized into three related but distinct categories: as a representative species genome, as a coordinate system for identifying variants, and as an alignment reference for variation detection algorithms. However, the use of this reference sequence as simultaneously a representative species genome and as an alignment reference leads to unnecessary artifacts for structural variation detection algorithms and limits their accuracy. We show how decoupling these two references and developing a separate alignment reference can significantly improve the accuracy of structural variation detection, lead to improved genotyping of disease related genes, and decrease the cost of studying polymorphism in a population.
PLOS ONE | 2015
Carmen S M Yong; Janelle Sharkey; Belinda Duscio; Ben Venville; Wei-Zen Wei; Richard F. Jones; Clare Y. Slaney; Gisela Mir Arnau; Anthony T. Papenfuss; Jan Schröder; Phillip K. Darcy; Michael H. Kershaw
The development of antigen-targeted therapeutics is dependent on the preferential expression of tumor-associated antigens (TAA) at targetable levels on the tumor. Tumor-associated antigens can be generated de novo or can arise from altered expression of normal basal proteins, such as the up-regulation of human epidermal growth factor receptor 2 (Her2/ErbB2). To properly assess the development of Her2 therapeutics in an immune tolerant model, we previously generated a transgenic mouse model in which expression of the human Her2 protein was present in both the brain and mammary tissue. This mouse model has facilitated the development of Her2 targeted therapies in a clinically relevant and suitable model. While heterozygous Her2+/- mice appear to develop in a similar manner to wild type mice (Her2-/-), it has proven difficult to generate homozygous Her2+/+ mice, potentially due to embryonic lethality. In this study, we performed whole genome sequencing to determine if the integration site of the Her2 transgene was responsible for this lethality. Indeed, we report that the Her2 transgene had integrated into the Pds5b (precocious dissociation of sisters) gene on chromosome 5, as a 162 copy concatemer. Furthermore, our findings demonstrate that Her2+/+ mice, similar to Pds5b-/- mice, are embryonic lethal and confirm the necessity for Pds5b in embryonic development. This study confirms the value of whole genome sequencing in determining the integration site of transgenes to gain insight into associated phenotypes.