Saskia Hiltemann
Erasmus University Medical Center
Publication
Featured research published by Saskia Hiltemann.
GigaScience | 2014
Saskia Hiltemann; Hailiang Mei; Mattias de Hollander; Ivo Palli; Peter J. van der Spek; Guido Jenster; Andrew Stubbs
Background: Complete Genomics provides an open-source suite of command-line tools for the analysis of their CG-formatted mapped sequencing files. Determining, for example, the functional impact of detected variants requires annotation with various databases that often demand command-line and/or programming experience, which limits their use by the average research scientist. We have therefore implemented this CG toolkit, together with a number of annotation, visualisation and file manipulation tools, in Galaxy as CGtag (Complete Genomics Toolkit and Annotation in a Cloud-based Galaxy). Findings: In order to provide research scientists with web-based, simple and accurate analytical and visualisation applications for the selection of candidate mutations from Complete Genomics data, we have implemented the open-source Complete Genomics tool set, CGATools, in Galaxy. In addition, we implemented some of the most popular command-line annotation and visualisation tools to allow research scientists to select candidate pathological mutations (SNVs and indels). Furthermore, we have developed a cloud-based public Galaxy instance to host the CGtag toolkit and other associated modules. Conclusions: CGtag provides a user-friendly interface to all research scientists wishing to select candidate variants from CG or other next-generation sequencing platforms’ data. By using a cloud-based infrastructure, we can also assure sufficient and on-demand computation and storage resources to handle the analysis tasks. The tools are freely available for use from an NBIC/CTMM-TraIT (The Netherlands Bioinformatics Center/Center for Translational Molecular Medicine) cloud-based Galaxy instance, or can be installed to a local (production) Galaxy via the NBIC Galaxy tool shed.
BMC Immunology | 2014
Michael Moorhouse; David van Zessen; Hanna IJspeert; Saskia Hiltemann; Sebastian Horsman; Peter J. van der Spek; Mirjam van der Burg; Andrew Stubbs
Background: Sequence analysis of immunoglobulin heavy chain (IGH) gene rearrangements and frequency analysis is a powerful tool for studying the immune repertoire, immune responses and immune dysregulation in health and disease. The challenge is to provide user-friendly, secure and reproducible analytical services that are available to both small and large laboratories that are determining the VDJ repertoire using NGS technology. Results: In this study we describe ImmunoGlobulin Galaxy (IGGalaxy), a convenient web-based application for analyzing next-generation sequencing results and reporting IGH gene rearrangements for both repertoire and clonality studies. IGGalaxy has two analysis options: one using the built-in igBLAST algorithm and the second using output from IMGT; in either case, repertoire summaries for the B-cell populations tested are available. IGGalaxy supports multi-sample and multi-replicate input analysis for both igBLAST and IMGT/HIGHV-QUEST. We demonstrate the technical validity of this platform using a standard dataset, S22, used for benchmarking the performance of antibody alignment utilities, with a 99.9% concordance with previous results. Re-analysis of NGS data from our samples of RAG-deficient patients demonstrated the validity and user-friendliness of this tool. Conclusions: IGGalaxy provides clinical researchers with detailed insight into the repertoire of the B-cell population per individual sequenced and between control and pathogenic genomes. IGGalaxy was developed for 454 NGS results but is capable of analyzing alternative NGS data (e.g. Illumina, Ion Torrent). We demonstrate the use of a Galaxy virtual machine to determine the VDJ repertoire for reference data and from B-cells taken from immune-deficient patients. IGGalaxy is available as a VM for download and use on a desktop PC or on a server.
Genome Research | 2015
Saskia Hiltemann; Guido Jenster; Jan Trapman; Peter J. van der Spek; Andrew Stubbs
Tumor analyses commonly employ a correction with a matched normal (MN), a sample from healthy tissue of the same individual, in order to distinguish germline mutations from somatic mutations. Since the majority of variants found in an individual are thought to be common within the population, we constructed a set of 931 samples from healthy, unrelated individuals, originating from two different sequencing platforms, to serve as a virtual normal (VN) in the absence of such an associated normal sample. Our approach removed (1) >96% of the germline variants also removed by the MN sample and (2) a large number (2%-8%) of additional variants not corrected for by the associated normal. The combination of the VN with the MN improved the correction for polymorphisms significantly, by up to ∼30% compared with the MN only and ∼15% compared with the VN only. We determined that the number of unrelated genomes needed to correct at least as efficiently as the MN is about 200 for structural variations (SVs) and about 400 for single-nucleotide variants (SNVs) and indels. In addition, we propose that the removal of common variants with purely position-based methods is inaccurate and incurs additional false-positive somatic variants, and that more sophisticated algorithms, capable of leveraging information about the area surrounding variants, are needed for optimal accuracy. Our VN correction method can be used to analyze any list of variants, regardless of the sequencing platform of origin. This VN methodology is available for use on our public Galaxy server.
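The idea of correcting a tumor variant list with a virtual normal, optionally combined with a matched normal, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Variant` tuple, the function names, and the toy data are all hypothetical. It also deliberately uses exact matching on position and alleles, which is the kind of naive comparison the abstract describes as inaccurate, so it should be read only as a baseline illustration of the filtering step.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    """Hypothetical minimal variant record (not a real file format)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def build_virtual_normal(normal_samples):
    """Pool every variant seen in any healthy sample into one panel."""
    panel = set()
    for sample in normal_samples:
        panel.update(sample)
    return panel


def filter_candidates(tumor, virtual_normal, matched_normal=frozenset()):
    """Keep tumor variants absent from both the VN panel and the MN sample."""
    return [
        v for v in tumor
        if v not in virtual_normal and v not in matched_normal
    ]


# Toy data, invented for illustration only.
normals = [
    {Variant("chr1", 100, "A", "G"), Variant("chr2", 200, "C", "T")},
    {Variant("chr1", 100, "A", "G")},
]
tumor = [
    Variant("chr1", 100, "A", "G"),
    Variant("chr2", 200, "C", "T"),
    Variant("chr3", 300, "G", "A"),
    Variant("chr4", 400, "T", "C"),
]
matched = {Variant("chr4", 400, "T", "C")}

vn = build_virtual_normal(normals)
vn_only = filter_candidates(tumor, vn)
vn_and_mn = filter_candidates(tumor, vn, matched)
print(len(vn_only), len(vn_and_mn))
```

In this toy case the matched normal removes a private germline variant that the panel misses, which mirrors the abstract's observation that combining the VN with the MN corrects more than either alone.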
Journal of Clinical Bioinformatics | 2012
Andrew Stubbs; Elizabeth A. McClellan; Sebastiaan Horsman; Saskia Hiltemann; Ivo Palli; Stephan Nouwens; Anton H. J. Koning; Frits Hoogland; Joke Reumers; Daphne Heijsman; Sigrid M.A. Swagemakers; Andreas Kremer; J Meijerink; Diether Lambrechts; Peter J. van der Spek
Background: Next generation sequencing provides clinical research scientists with direct read out of innumerable variants, including personal, pathological and common benign variants. The aim of resequencing studies is to determine the candidate pathogenic variants from individual genomes, or from family-based or tumor/normal genome comparisons. Whilst the use of appropriate controls within the experimental design will minimize the number of false-positive variations selected, this number can be reduced further with the use of high-quality whole genome reference data to minimize false-positive variants prior to candidate gene selection. In addition, the use of platform-related sequencing error models can help in the recovery of ambiguous genotypes from lower coverage data. Description: We have developed a whole genome database of human genetic variations, Huvariome, determined by whole genome deep sequencing data with high coverage and low error rates. The database was designed to be sequencing technology independent but is currently populated with 165 individual whole genomes consisting of small pedigrees and matched tumor/normal samples sequenced with the Complete Genomics sequencing platform. Common variants have been determined for a Benelux population cohort and represented as genotypes alongside the results of two sets of control data (73 of the 165 genomes), Huvariome Core which comprises 31 healthy individuals from the Benelux region, and Diversity Panel consisting of 46 healthy individuals representing 10 different populations and 21 samples in three pedigrees. Users can query the database by gene or position via a web interface and the results are displayed as the frequency of the variations as detected in the datasets. We demonstrate that Huvariome can provide accurate reference allele frequencies to disambiguate sequencing inconsistencies produced in resequencing experiments.
Huvariome has been used to support the selection of candidate cardiomyopathy-related genes which have a homozygous genotype in the reference cohorts. This database allows users to see which selected variants are common variants (> 5% minor allele frequency) in the Huvariome Core samples, thus aiding in the selection of potentially pathogenic variants by filtering out common variants that are not listed in one of the other public genomic variation databases. The no-call rate and the accuracy of allele calling in Huvariome provide the user with the possibility of identifying platform-dependent errors associated with specific regions of the human genome. Conclusion: Huvariome is a simple-to-use resource for validation of resequencing results obtained by NGS experiments. The high sequence coverage and low error rates provide scientists with the ability to remove false-positive results from pedigree studies. Results are returned via a web interface that displays location-based genetic variation frequency, impact on protein function, association with known genetic variations and a quality score of the variation base derived from Huvariome Core and the Diversity Panel data. These results may be used to identify and prioritize rare variants that, for example, might be disease relevant. In testing the accuracy of the Huvariome database, alleles of a selection of ambiguously called coding single nucleotide variants were successfully predicted in all cases. Data protection of individuals is ensured by restricted access to patient-derived genomes from the host institution, which is relevant for future molecular diagnostics.
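The "> 5% minor allele frequency" cut-off mentioned above can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: the function names, the allele counts, and the decision to treat sites with no called alleles as "not common" are hypothetical and do not describe how Huvariome itself computes or stores frequencies.

```python
COMMON_MAF_THRESHOLD = 0.05  # the "> 5%" cut-off quoted in the abstract


def minor_allele_frequency(alt_allele_count, called_allele_count):
    """Return the frequency of the rarer allele, or None if nothing was called."""
    if called_allele_count == 0:
        return None  # all no-calls: frequency is undefined
    freq = alt_allele_count / called_allele_count
    return min(freq, 1.0 - freq)


def is_common(alt_allele_count, called_allele_count,
              threshold=COMMON_MAF_THRESHOLD):
    """True when the minor allele frequency exceeds the threshold."""
    maf = minor_allele_frequency(alt_allele_count, called_allele_count)
    return maf is not None and maf > threshold


# Invented counts: 31 diploid individuals give at most 62 called alleles.
candidates = {
    "variant_a": (10, 62),  # MAF about 0.16, common
    "variant_b": (2, 62),   # MAF about 0.03, rare
    "variant_c": (0, 0),    # no calls at this site (assumed: keep for review)
}

rare = [
    name for name, (alt, called) in candidates.items()
    if not is_common(alt, called)
]
print(rare)
```

Keeping uncalled sites in the candidate list is one possible design choice; a stricter pipeline might instead flag them separately, since the abstract notes that no-call rates can point to platform-dependent errors.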
Oncogene | 2015
I Teles Alves; T Hartjes; Elizabeth A. McClellan; Saskia Hiltemann; René Böttcher; N Dits; M R Temanni; Bart J. Janssen; W van Workum; P.J. van der Spek; Andrew Stubbs; A. de Klein; Bert H.J. Eussen; Jan Trapman; Guido Jenster
Gene fusions, mainly between TMPRSS2 and ERG, are frequent early genomic rearrangements in prostate cancer (PCa). In order to discover novel genomic fusion events, we applied whole-genome paired-end sequencing to identify structural alterations present in a primary PCa patient (G089) and in a PCa cell line (PC346C). Overall, we identified over 3800 genomic rearrangements in each of the two samples as compared with the reference genome. Correcting these structural variations for polymorphisms using whole-genome sequences of 46 normal samples, the numbers of cancer-related rearrangements were 674 and 387 for G089 and PC346C, respectively. From these, 192 in G089 and 106 in PC346C affected gene structures. Exclusion of small intronic deletions left 33 intergenic breaks in G089 and 14 in PC346C. Out of these, 12 and 9 reassembled genes with the same orientation, capable of generating a feasible fusion transcript. Using PCR we validated all the reliable predicted gene fusions. Two gene fusions were in-frame: MPP5–FAM71D in PC346C and ARHGEF3–C8ORF38 in G089. Downregulation of FAM71D and MPP5–FAM71D transcripts in PC346C cells decreased proliferation; however, no effect was observed in the RWPE-1-immortalized normal prostate epithelial cells. Together, our data showed that gene rearrangements frequently occur in PCa genomes but result in a limited number of fusion transcripts. Most of these fusion transcripts do not encode in-frame fusion proteins. The unique in-frame MPP5–FAM71D fusion product is important for proliferation of PC346C cells.
Human Genetics | 2013
Inês Teles Alves; Saskia Hiltemann; Thomas Hartjes; Peter J. van der Spek; Andrew Stubbs; Jan Trapman; Guido Jenster
The VCaP cell line is widely used in prostate cancer research as it is a unique model to study castrate-resistant disease expressing high levels of the wild-type androgen receptor and the TMPRSS2-ERG fusion transcript. Using next generation sequencing, we assembled the structural variations in VCaP genomic DNA and observed a massive number of genomic rearrangements along the q arm of chromosome 5, characteristic of chromothripsis. Chromothripsis is a recently recognized phenomenon characterized by extensive chromosomal shattering in a single catastrophic event, mainly detected in cancer cells. Various structural events identified on chromosome 5q of VCaP resulted in gene fusions. Out of the 18 gene fusion candidates tested, 15 were confirmed at the genomic level. In our set of gene fusions, we only rarely observed microhomology flanking the breakpoints. At the RNA level, only five transcripts were detected, and NDUFAF2-MAST4 was the only one resulting in an in-frame fusion transcript. Our data indicate that, although a marker of genomic instability, chromothripsis might lead to only a limited number of functionally relevant fusion genes.
Bioinformatics | 2016
Youri Hoogstrate; René Böttcher; Saskia Hiltemann; Peter J. van der Spek; Guido Jenster; Andrew Stubbs
A new generation of tools that identify fusion genes in RNA-seq data is limited in sensitivity and/or specificity. To allow further downstream analysis and to estimate performance, predicted fusion genes from different tools have to be compared. However, the transcriptomic context complicates genomic location-based matching. FusionMatcher (FuMa) is a program that reports identical fusion genes based on gene-name annotations. FuMa automatically compares and summarizes all combinations of two or more datasets in a single run, without additional programming necessary. FuMa uses one gene annotation, avoiding mismatches caused by tool-specific gene annotations. FuMa matches 10% more fusion genes compared with exact gene matching due to overlapping genes and accepts intermediate output files that allow a stepwise analysis of corresponding tools. Availability and implementation: The code is available at https://github.com/ErasmusMC-Bioinformatics/fuma, is available for Galaxy in the tool sheds, and is directly accessible at https://bioinf-galaxian.erasmusmc.nl/galaxy/. Supplementary information: Supplementary data are available at Bioinformatics online.
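To illustrate why matching on gene-name annotations can recover more shared fusions than exact gene matching when genes overlap, here is a minimal Python sketch. It is not FuMa's code and its matching rule is an assumption made for illustration: each fusion is modelled as a pair of gene-name sets (one per breakpoint), `GENE_X` is an invented gene overlapping ERG, and fusion direction is ignored.

```python
def exact_match(fusion_a, fusion_b):
    """Both breakpoints must carry identical sets of gene names."""
    return fusion_a[0] == fusion_b[0] and fusion_a[1] == fusion_b[1]


def overlap_match(fusion_a, fusion_b):
    """Each breakpoint must share at least one gene name with its counterpart."""
    left_shared = fusion_a[0] & fusion_b[0]
    right_shared = fusion_a[1] & fusion_b[1]
    return bool(left_shared) and bool(right_shared)


# Tool A places the right breakpoint in ERG only.
tool_a = (frozenset({"TMPRSS2"}), frozenset({"ERG"}))
# Tool B's right breakpoint falls where ERG and a hypothetical
# overlapping gene (GENE_X) are both annotated.
tool_b = (frozenset({"TMPRSS2"}), frozenset({"ERG", "GENE_X"}))
# An unrelated call using gene names from elsewhere in this list.
tool_c = (frozenset({"MPP5"}), frozenset({"FAM71D"}))

print(exact_match(tool_a, tool_b))    # the gene sets differ
print(overlap_match(tool_a, tool_b))  # ERG is shared on the right side
print(overlap_match(tool_a, tool_c))  # no shared genes
```

Annotating every tool's breakpoints against the same gene annotation before comparison, as the abstract describes, is what makes such name-based set comparisons meaningful across tools.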
Bioinformatics | 2013
Saskia Hiltemann; Elizabeth A. McClellan; Jos van Nijnatten; Sebastiaan Horsman; Ivo Palli; Inês Teles Alves; Thomas Hartjes; Jan Trapman; Peter J. van der Spek; Guido Jenster; Andrew Stubbs
We present iFUSE (integrated fusion gene explorer), an online visualization tool that provides a fast and informative view of structural variation data and prioritizes those breaks likely representing fusion genes. This application uses calculated break points to determine fusion genes based on the latest annotation for genomic sequence information, and where relevant the structural variation (SV) events are annotated with predicted RNA and protein sequences. iFUSE takes as input a Complete Genomics (CG) junction file, a FusionMap fusion detection report file or a file already analysed and annotated by the iFUSE application on a previous occasion. Results: We demonstrate the use of iFUSE with case studies from tumour-normal SV detection derived from Complete Genomics whole-genome sequencing results. Availability: iFUSE is available as a web service at http://ifuse.erasmusmc.nl.
GigaScience | 2014
Saskia Hiltemann; Youri Hoogstrate; Peter J. van der Spek; Guido Jenster; Andrew Stubbs
Background: Galaxy offers a number of visualisation options with components such as Trackster, Circster and Galaxy Charts, but currently lacks the ability to easily combine outputs from different tools into a single view or report. A number of tools produce HTML reports as output in order to combine the various output files from a single tool; however, this requires programming and knowledge of HTML, and the reports must be custom-made for each new tool. Findings: We have developed a generic and flexible reporting tool for Galaxy, iReport, that allows users to create interactive HTML reports directly from the Galaxy UI, with the ability to combine an arbitrary number of outputs from any number of different tools. Content can be organised into different tabs, and interactivity can be added to components. To demonstrate the capability of iReport we provide two publicly available examples. The first is an iReport explaining iReports, created for, and using content from, the recent Galaxy Community Conference 2014. The second is a genetic report based on a trio analysis to determine candidate pathogenic variants, which uses our previously developed Galaxy toolset for whole-genome NGS analysis, CGtag. These reports may be adapted for outputs from any sequencing platform and any results, such as omics data, non-high-throughput results and clinical variables. Conclusions: iReport provides a secure, collaborative, and flexible web-based reporting system that is compatible with Galaxy (and non-Galaxy) generated content. We demonstrate its value with a real-life example of reporting genetic trio-analysis.
European Journal of Clinical Microbiology & Infectious Diseases | 2018
Stefan A. Boers; Saskia Hiltemann; Andrew Stubbs; Ruud Jansen; John P. Hays
Microbiota profiling has the potential to greatly impact routine clinical diagnostics by detecting DNA derived from live, fastidious, and dead bacterial cells present within clinical samples. Such results could potentially be used to benefit patients by influencing antibiotic prescribing practices or to generate new classical-based diagnostic methods, e.g., culture or PCR. However, technical flaws in 16S rRNA gene next-generation sequencing (NGS) protocols, together with the requirement for access to bioinformatics, currently hinder the introduction of microbiota analysis into clinical diagnostics. Here, we report on the development and evaluation of an “end-to-end” microbiota profiling platform (MYcrobiota), which combines our previously validated micelle PCR/NGS (micPCR/NGS) methodology with an easy-to-use, dedicated bioinformatics pipeline. The newly designed bioinformatics pipeline processes micPCR/NGS data automatically and summarizes the results in interactive but simple web reports. In order to explore the utility of MYcrobiota in clinical diagnostics, 47 clinical samples (40 “damaged skin” samples and 7 synovial fluids) were investigated using routine bacterial culture as the comparator. MYcrobiota confirmed the presence of bacterial DNA in 37/37 culture-positive samples and detected bacterial taxa in 2/10 culture-negative samples. Moreover, 36/38 potentially relevant aerobic bacterial taxa and 3/3 mixtures of anaerobic bacteria were identified using culture and MYcrobiota, with the sensitivity and specificity being 95%. Interestingly, the majority of the 448 bacterial taxa identified using MYcrobiota were not identified using culture, which could potentially have an impact on clinical decision-making. Taken together, the development of MYcrobiota is a promising step towards the introduction of microbiota analysis into clinical diagnostic laboratories.