Ewa Szczurek
Max Planck Society
Publication
Featured research published by Ewa Szczurek.
Research in Computational Molecular Biology | 2014
Ewa Szczurek; Niko Beerenwinkel
Recent years in cancer research were characterized by both accumulation of data and growing awareness of its overwhelming complexity. Consortia like The Cancer Genome Atlas [1] generated large collections of tumor samples, recording presence or absence of genomic alterations, such as somatic point mutations, amplifications, or deletions of genes. One of the basic tasks in the analysis of tumor genomic data is to elucidate sets of genes involved in a common oncogenic pathway. A de novo approach to this task is to search for mutually exclusive patterns in cancer genomic data [2, 3, 4, 5], where these alterations tend not to occur together in the same patient. Such patterns are commonly evaluated and ranked by their coverage and impurity. Coverage is defined as the number of patient samples in which at least one alteration occurred, while impurity refers to non-exclusive, additional alterations that violate strict mutual exclusivity. Mutually exclusive patterns have frequently been observed in cancer data, and were associated with functional pathways [6].
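The two ranking criteria defined above can be illustrated with a minimal Python sketch. It assumes alterations are given as one 0/1 list per gene, with one entry per patient. The function name and the toy data are hypothetical, and counting patients with two or more alterations is only one possible way to quantify impurity, not necessarily the measure used in the cited work.

```python
from typing import Dict, List, Tuple


def coverage_and_impurity(alterations: Dict[str, List[int]]) -> Tuple[int, int]:
    """Return (coverage, impurity) for a gene set.

    alterations maps each gene to a 0/1 list with one entry per patient.
    Coverage: number of patients with at least one altered gene.
    Impurity (assumed definition): number of patients with two or more
    altered genes, i.e. patients that violate strict mutual exclusivity.
    """
    genes = list(alterations)
    n_patients = len(alterations[genes[0]])
    # Number of altered genes in the set, per patient.
    per_patient = [sum(alterations[g][p] for g in genes) for p in range(n_patients)]
    coverage = sum(1 for c in per_patient if c >= 1)
    impurity = sum(1 for c in per_patient if c >= 2)
    return coverage, impurity


# Toy data (hypothetical): three genes, five patients.
toy = {
    "GENE_A": [1, 0, 0, 1, 0],
    "GENE_B": [0, 1, 0, 1, 0],
    "GENE_C": [0, 0, 0, 0, 0],
}
print(coverage_and_impurity(toy))  # (3, 1)
```

In the toy data, three patients carry at least one alteration, and one of them carries two, so a perfectly exclusive pattern would have the same coverage with impurity zero.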
Molecular Systems Biology | 2009
Ewa Szczurek; Irit Gat-Viks; Jerzy Tiuryn; Martin Vingron
Signaling cascades are triggered by environmental stimulation and propagate the signal to regulate transcription. Systematic reconstruction of the underlying regulatory mechanisms requires pathway‐targeted, informative experimental data. However, practical experimental design approaches are still in their infancy. Here, we propose a framework that iterates design of experiments and identification of regulatory relationships downstream of a given pathway. The experimental design component, called MEED, aims to minimize the amount of laboratory effort required in this process. To avoid ambiguity in the identification of regulatory relationships, the choice of experiments maximizes diversity between expression profiles of genes regulated through different mechanisms. The framework takes advantage of expert knowledge about the pathways under study, formalized in a predictive logical model. By considering model‐predicted dependencies between experiments, MEED is able to suggest a whole set of experiments that can be carried out simultaneously. Our framework was applied to investigate interconnected signaling pathways in yeast. In comparison with other approaches, MEED suggested the most informative experiments for unambiguous identification of transcriptional regulation in this system.
Genome Research | 2013
Aleksander Jankowski; Ewa Szczurek; Ralf Jauch; Jerzy Tiuryn; Shyam Prabhakar
The binding of transcription factors (TFs) to their specific motifs in genomic regulatory regions is commonly studied in isolation. However, in order to elucidate the mechanisms of transcriptional regulation, it is essential to determine which TFs bind DNA cooperatively as dimers and to infer the precise nature of these interactions. So far, only a small number of such dimeric complexes are known. Here, we present an algorithm for predicting cell-type-specific TF-TF dimerization on DNA on a large scale, using DNase I hypersensitivity data from 78 human cell lines. We represented the universe of possible TF complexes by their corresponding motif complexes, and analyzed their occurrence at cell-type-specific DNase I hypersensitive sites. Based on ∼1.4 billion tests for motif complex enrichment, we predicted 603 highly significant cell-type-specific TF dimers, the vast majority of which are novel. Our predictions included 76% (19/25) of the known dimeric complexes and showed significant overlap with an experimental database of protein-protein interactions. They were also independently supported by evolutionary conservation, as well as quantitative variation in DNase I digestion patterns. Notably, the known and predicted TF dimers were almost always highly compact and rigidly spaced, suggesting that TFs dimerize in close proximity to their partners, which results in strict constraints on the structure of the DNA-bound complex. Overall, our results indicate that chromatin openness profiles are highly predictive of cell-type-specific TF-TF interactions. Moreover, cooperative TF dimerization seems to be a widespread phenomenon, with multiple TF complexes predicted in most cell types.
IEEE/ACM Transactions on Computational Biology and Bioinformatics | 2009
Mikhail A. Roytberg; Anna Gambin; Laurent Noé; Sławomir Lasota; Eugenia Furletova; Ewa Szczurek; Gregory Kucherov
We apply the concept of subset seeds proposed in [1] to similarity search in protein sequences. The main question studied is the design of efficient seed alphabets to construct seeds with optimal sensitivity/selectivity trade-offs. We propose several different design methods and use them to construct several alphabets. We then perform a comparative analysis of seeds built over those alphabets and compare them with the standard BLASTP seeding method [2], [3], as well as with the family of vector seeds proposed in [4]. While the formalism of subset seeds is less expressive (but less costly to implement) than the cumulative principle used in BLASTP and vector seeds, our seeds show a similar or even better performance than BLASTP on Bernoulli models of proteins compatible with the common BLOSUM62 matrix. Finally, we perform a large-scale benchmarking of our seeds against several main databases of protein alignments. Here again, the results show a comparable or better performance of our seeds versus BLASTP.
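As a rough illustration of how a seed over a grouped amino acid alphabet could be tested against a pair of aligned fragments, the Python sketch below treats each seed letter as a set of amino acid groups and accepts a position when the two residues are identical or fall into the same group. This reading of subset seeds, the three seed letters, and all names are assumptions made for illustration. The groupings are not the alphabets designed in the paper.

```python
from typing import FrozenSet, List

# A seed letter is modeled as a list of amino acid groups (assumption).
SeedLetter = List[FrozenSet[str]]

# Hypothetical seed letters, for illustration only.
IDENTITY: SeedLetter = []  # accepts only identical residues
GROUPED: SeedLetter = [frozenset("ILVM"), frozenset("KR"), frozenset("DE")]
ANY: SeedLetter = [frozenset("ACDEFGHIKLMNPQRSTVWY")]  # accepts any pair


def letter_accepts(letter: SeedLetter, a: str, b: str) -> bool:
    """True if residues a and b are identical or share a group of this letter."""
    if a == b:
        return True
    return any(a in group and b in group for group in letter)


def seed_hits(seed: List[SeedLetter], s1: str, s2: str) -> bool:
    """True if every aligned residue pair is accepted by the seed letter at its position."""
    if len(s1) != len(seed) or len(s2) != len(seed):
        raise ValueError("fragments must have the same length as the seed")
    return all(letter_accepts(letter, a, b) for letter, a, b in zip(seed, s1, s2))


seed = [IDENTITY, GROUPED, ANY]
print(seed_hits(seed, "WIA", "WLK"))  # True
print(seed_hits(seed, "WIA", "WKK"))  # False
```

Because each position is checked independently, a test of this kind needs no running score, which is consistent with the abstract's remark that subset seeds are less costly to implement than cumulative schemes.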
BMC Genomics | 2014
Pauli Rämö; Anna Drewek; Cécile Arrieumerlou; Niko Beerenwinkel; Houchaima Ben-Tekaya; Bettina Cardel; Alain Casanova; Raquel Conde-Álvarez; Pascale Cossart; Gabor Csucs; Simone Eicher; Mario Emmenlauer; Urs F. Greber; Wolf-Dietrich Hardt; Ari Helenius; Christoph Alexander Kasper; Andreas Kaufmann; Saskia Kreibich; Andreas Kühbacher; Peter Z. Kunszt; Shyan Huey Low; Jason Mercer; Daria Mudrak; Simone Muntwiler; Lucas Pelkmans; Javier Pizarro-Cerdá; Michael Podvinec; Eva Pujadas; Bernd Rinn; Vincent Rouilly
BACKGROUND Large-scale RNAi screening has become an important technology for identifying genes involved in biological processes of interest. However, the quality of large-scale RNAi screening is often degraded by off-target effects. In order to find statistically significant effector genes for pathogen entry, we systematically analyzed entry pathways in human host cells for eight pathogens using image-based kinome-wide siRNA screens with siRNAs from three vendors. We propose a Parallel Mixed Model (PMM) approach that simultaneously analyzes several non-identical screens performed with the same RNAi libraries. RESULTS We show that PMM gains statistical power for hit detection due to parallel screening. PMM allows incorporating siRNA weights that can be assigned according to available information on RNAi quality. Moreover, PMM is able to estimate a sharedness score that can be used to focus follow-up efforts on generic or specific gene regulators. By fitting a PMM to our data, we found several novel hit genes for most of the pathogens studied. CONCLUSIONS Our results show that parallel RNAi screening can improve the results of individual screens. This is particularly relevant now that large-scale parallel datasets are becoming increasingly publicly available. Our comprehensive siRNA dataset provides a public, freely available resource for further statistical and biological analyses in the high-content, high-throughput siRNA screening field.
Genome Biology | 2015
Fabian Schmich; Ewa Szczurek; Saskia Kreibich; Sabrina Dilling; Daniel Andritschke; Alain Casanova; Shyan Huey Low; Simone Eicher; Simone Muntwiler; Mario Emmenlauer; Pauli Rämö; Raquel Conde-Álvarez; Christian von Mering; Wolf-Dietrich Hardt; Christoph Dehio; Niko Beerenwinkel
Small interfering RNAs (siRNAs) exhibit strong off-target effects, which confound the gene-level interpretation of RNA interference screens and thus limit their utility for functional genomics studies. Here, we present gespeR, a statistical model for reconstructing individual, gene-specific phenotypes. Using 115,878 siRNAs, single and pooled, from three companies in three pathogen infection screens, we demonstrate that deconvolution of image-based phenotypes substantially improves the reproducibility between independent siRNA sets targeting the same genes. Genes selected and prioritized by gespeR are validated and shown to constitute biologically relevant components of pathogen entry mechanisms and TGF-β signaling. gespeR is available as a Bioconductor R-package.
International Journal of Cancer | 2013
Ewa Szczurek; Navodit Misra; Martin Vingron
Synthetic lethal interactions in cancer hold the potential for successful combined therapies, which would avoid the difficulties of single molecule‐targeted treatment. Identification of interactions that are specific for human tumors is an open problem in cancer research. This work aims at deciphering synthetic sick or lethal interactions directly from somatic alteration, expression and survival data of cancer patients. To this end, we look for pairs of genes and their alterations or expression levels that are “avoided” by tumors and “beneficial” for patients. Thus, candidates for synthetic sickness or lethality (SSL) interaction are identified as such gene pairs whose combination of states is under‐represented in the data. Our main methodological contribution is a quantitative score that allows ranking of the candidate SSL interactions according to evidence found in patient survival. Applying this analysis to glioblastoma data, we collect 1,956 synthetic sick or lethal partners for 85 abundantly altered genes, most of which show extensive copy number variation across the patient cohort. We rediscover and interpret the known interaction between TP53 and PLK1, as well as provide insight into the mechanism behind EGFR interacting with AKT2, but not AKT1 nor AKT3. Cox model analysis determines 274 of the identified interactions as having significant impact on overall survival in glioblastoma, which is more informative than a standard survival predictor based on patient age.
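The idea of a gene pair whose combination of states is under-represented can be illustrated with a standard one-sided hypergeometric test, sketched below in Python. This is a swapped-in textbook test chosen for illustration, not the survival-based score described in the abstract. The function name, its parameters, and the example counts are hypothetical.

```python
from math import comb


def underrepresentation_pvalue(n_total: int, n_a: int, n_b: int, n_both: int) -> float:
    """Lower-tail hypergeometric p-value for co-occurrence.

    n_total: number of patients
    n_a:     patients with gene A in the state of interest
    n_b:     patients with gene B in the state of interest
    n_both:  patients with both states observed together
    Returns P(X <= n_both) when the n_b patients are drawn at random,
    i.e. the chance of seeing this few co-occurrences under independence.
    """
    smallest = max(0, n_a + n_b - n_total)  # smallest overlap that is possible
    total = comb(n_total, n_b)
    tail = sum(
        comb(n_a, k) * comb(n_total - n_a, n_b - k)
        for k in range(smallest, n_both + 1)
    )
    return tail / total


# Hypothetical counts: 10 patients, 5 with A altered, 5 with B altered, none with both.
p = underrepresentation_pvalue(10, 5, 5, 0)
print(p)  # 1/252, about 0.004
```

A small p-value marks a pair as a candidate only in the sense of being "avoided" by tumors. Whether the combination is also "beneficial" for patients would still have to be assessed from survival data.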
Bioinformatics | 2016
Simona Constantinescu; Ewa Szczurek; Pejman Mohammadi; Jörg Rahnenführer; Niko Beerenwinkel
MOTIVATION Despite recent technological advances in genomic sciences, our understanding of cancer progression and its driving genetic alterations remains incomplete. RESULTS We introduce TiMEx, a generative probabilistic model for detecting patterns of various degrees of mutual exclusivity across genetic alterations, which can indicate pathways involved in cancer progression. TiMEx explicitly accounts for the temporal interplay between the waiting times to alterations and the observation time. In simulation studies, we show that our model outperforms previous methods for detecting mutual exclusivity. On large-scale biological datasets, TiMEx identifies gene groups with strong functional biological relevance, while also proposing new candidates for biological validation. TiMEx possesses several advantages over previous methods, including a novel generative probabilistic model of tumorigenesis, direct estimation of the probability of mutual exclusivity interaction, computational efficiency and high sensitivity in detecting gene groups involving low-frequency alterations. AVAILABILITY AND IMPLEMENTATION TiMEx is available as a Bioconductor R package at www.bsse.ethz.ch/cbg/software/TiMEx. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Bioinformatics | 2014
Navodit Misra; Ewa Szczurek; Martin Vingron
MOTIVATION Cancer cell genomes acquire several genetic alterations during somatic evolution from a normal cell type. The relative order in which these mutations accumulate and contribute to cell fitness is affected by epistatic interactions. Inferring their evolutionary history is challenging because of the large number of mutations acquired by cancer cells as well as the presence of unknown epistatic interactions. RESULTS We developed Bayesian Mutation Landscape (BML), a probabilistic approach for reconstructing ancestral genotypes from tumor samples for much larger sets of genes than previously feasible. BML infers the likely sequence of mutation accumulation for any set of genes that is recurrently mutated in tumor samples. When applied to tumor samples from colorectal, glioblastoma, lung and ovarian cancer patients, BML identifies the diverse evolutionary scenarios involved in tumor initiation and progression in greater detail, but broadly in agreement with prior results. AVAILABILITY AND IMPLEMENTATION Source code and all datasets are freely available at bml.molgen.mpg.de. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Journal of Computational Biology | 2010
Ewa Szczurek; Przemysław Biecek; Jerzy Tiuryn; Martin Vingron
Gene expression measurements allow determining sets of up- or down-regulated, or unchanged genes in a particular experimental condition. Additional biological knowledge can suggest examples of genes from one of these sets. For instance, known target genes of a transcriptional activator are expected, but not certain, to go down after this activator is knocked out. Available differential expression analysis tools do not take such imprecise examples into account. Here we put forward a novel partially supervised mixture modeling methodology for differential expression analysis. Our approach, guided by imprecise examples, clusters expression data into differentially expressed and unchanged genes. The partially supervised methodology is implemented by two methods: a newly introduced belief-based mixture modeling, and soft-label mixture modeling, a method proven efficient in other applications. We investigate on synthetic data the input example settings favorable for each method. In our tests, both belief-based and soft-label methods prove their advantage over semi-supervised mixture modeling in correcting for erroneous examples. We also compare them to alternative differential expression analysis approaches, showing that incorporation of knowledge yields better performance. We present a broad range of knowledge sources and data to which our partially supervised methodology can be applied. First, we determine targets of Ste12 based on yeast knockout data, guided by a Ste12 DNA-binding experiment. Second, we distinguish miR-1 from miR-124 targets in human by clustering expression data under transfection experiments of both microRNAs, using their computationally predicted targets as examples. Finally, we utilize literature knowledge to improve clustering of time-course expression profiles.