Publication


Featured research published by Catharina Olsen.


Nucleic Acids Research | 2016

TCGAbiolinks: an R/Bioconductor package for integrative analysis of TCGA data

Antonio Colaprico; Tiago Chedraoui Silva; Catharina Olsen; Luciano Garofano; Claudia Cava; Davide Garolini; Thais S. Sabedot; Tathiane Maistro Malta; Stefano Maria Pagnotta; Isabella Castiglioni; Michele Ceccarelli; Gianluca Bontempi; Houtan Noushmehr

The Cancer Genome Atlas (TCGA) research network has made public a large collection of clinical and molecular phenotypes of more than 10 000 tumor patients across 33 different tumor types. Using this cohort, TCGA has published over 20 marker papers detailing the genomic and epigenomic alterations associated with these tumor types. Although many important discoveries have been made by TCGA's research network, opportunities still exist to implement novel methods, thereby elucidating new biological pathways and diagnostic markers. However, mining the TCGA data presents several bioinformatics challenges, such as data retrieval and integration with clinical data and other molecular data types (e.g. RNA and DNA methylation). We developed an R/Bioconductor package called TCGAbiolinks to address these challenges and offer bioinformatics solutions by using a guided workflow to allow users to query, download and perform integrative analyses of TCGA data. We combined methods from computer science and statistics into the pipeline and incorporated methodologies developed in previous TCGA marker studies and in our own group. Using four different TCGA tumor types (Kidney, Brain, Breast and Colon) as examples, we provide case studies to illustrate examples of reproducibility, integrative analysis and utilization of different Bioconductor packages to advance and accelerate novel discoveries.


Eurasip Journal on Bioinformatics and Systems Biology | 2009

On the impact of entropy estimation on transcriptional regulatory network inference based on mutual information

Catharina Olsen; Patrick E. Meyer; Gianluca Bontempi

The reverse engineering of transcription regulatory networks from expression data is gaining large interest in the bioinformatics community. An important family of inference techniques is represented by algorithms based on information theoretic measures which rely on the computation of pairwise mutual information. This paper aims to study the impact of the entropy estimator on the quality of the inferred networks. This is done by means of a comprehensive study which takes into consideration three state-of-the-art mutual information algorithms: ARACNE, CLR, and MRNET. Two different setups are considered in this work. The first one considers a set of 12 synthetically generated datasets to compare 8 different entropy estimators and three network inference algorithms. The two methods emerging as the most accurate ones from the first set of experiments are the MRNET method combined with the newly applied Spearman correlation and the CLR method combined with the Pearson correlation. The validation of these two techniques is then carried out on a set of 10 public domain microarray datasets measuring the transcriptional regulatory activity in the yeast organism.
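As a rough illustration of the quantity these inference algorithms start from, the sketch below computes pairwise mutual information between two discretized expression profiles using the plain empirical (plug-in) entropy estimator. It is not the code used in the paper: the function names and the toy gene vectors are assumptions made for this example, and the study itself compares several more refined estimators.

```python
from collections import Counter
from math import log


def empirical_entropy(values):
    """Plug-in entropy estimate (in nats) from observed frequencies."""
    n = len(values)
    return -sum((c / n) * log(c / n) for c in Counter(values).values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two equally long discrete vectors."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    joint = list(zip(x, y))  # each pair is one joint observation
    return empirical_entropy(x) + empirical_entropy(y) - empirical_entropy(joint)


# Hypothetical, already-discretized expression levels (illustration only).
gene_a = [0, 0, 1, 1, 2, 2, 0, 1]
gene_b = list(gene_a)              # exact copy of gene_a
gene_c = [0, 1, 0, 1]              # used with gene_d below
gene_d = [0, 0, 1, 1]              # empirically independent of gene_c

mi_copy = mutual_information(gene_a, gene_b)
mi_independent = mutual_information(gene_c, gene_d)
```

For an exact copy the mutual information equals the entropy of the variable itself, while for the two empirically independent vectors it is zero up to rounding. Network inference methods such as those named above take a matrix of such pairwise scores as their input.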


Bioinformatics | 2013

mRMRe: an R package for parallelized mRMR ensemble feature selection

Nicolas De Jay; Simon Papillon-Cavanagh; Catharina Olsen; Nehme El-Hachem; Gianluca Bontempi; Benjamin Haibe-Kains

Motivation: Feature selection is one of the main challenges in analyzing high-throughput genomic data. Minimum redundancy maximum relevance (mRMR) is a particularly fast feature selection method for finding a set of both relevant and complementary features. Here we describe the mRMRe R package, in which the mRMR technique is extended by using an ensemble approach to better explore the feature space and build more robust predictors. To deal with the computational complexity of the ensemble approach, the main functions of the package are implemented and parallelized in C using the OpenMP Application Programming Interface. Results: Our ensemble mRMR implementations outperform the classical mRMR approach in terms of prediction accuracy. They identify genes more relevant to the biological context and may lead to richer biological interpretations. The parallelized functions included in the package show significant gains in terms of run-time speed when compared with previously released packages. Availability: The R package mRMRe is available on the Comprehensive R Archive Network and is provided open source under the Artistic-2.0 License. The code used to generate all the results reported in this application note is available from Supplementary File 1. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.
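To make the selection criterion concrete, here is a minimal sketch of classical greedy mRMR, not of the ensemble or parallelized variants the package provides and not of the mRMRe API. As a simplifying assumption, absolute Pearson correlation stands in for the mutual information terms; `pearson`, `mrmr_select` and the toy features are names invented for this example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long numeric lists (0.0 if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def mrmr_select(features, target, k):
    """Greedily pick k feature names maximizing relevance minus mean redundancy.

    features: dict mapping feature name -> list of values.
    """
    relevance = {name: abs(pearson(col, target)) for name, col in features.items()}
    selected = []
    remaining = set(features)

    def score(name):
        if not selected:
            return relevance[name]
        redundancy = sum(
            abs(pearson(features[name], features[s])) for s in selected
        ) / len(selected)
        return relevance[name] - redundancy

    while remaining and len(selected) < k:
        best = max(sorted(remaining), key=score)  # sorted() makes ties deterministic
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: the target depends on two uncorrelated features.
gene_a = [0, 1, 0, 1, 0, 1, 0, 1]
gene_b = [0, 0, 1, 1, 0, 0, 1, 1]
gene_a_noisy = [0, 1, 0, 1, 0, 1, 1, 1]   # near-duplicate of gene_a
target = [2 * a + b for a, b in zip(gene_a, gene_b)]

features = {"gene_a": gene_a, "gene_a_noisy": gene_a_noisy, "gene_b": gene_b}
chosen = mrmr_select(features, target, 2)
```

In this toy setting `gene_a_noisy` is more correlated with the target than `gene_b`, yet the redundancy penalty leads the second pick to `gene_b`, which is the complementary feature. A ranking by relevance alone would have kept the near-duplicate instead.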


Nucleic Acids Research | 2012

Predictive networks: a flexible, open source, web application for integration and analysis of human gene networks

Benjamin Haibe-Kains; Catharina Olsen; Amira Djebbari; Gianluca Bontempi; Mick Correll; Christopher Bouton; John Quackenbush

Genomics provided us with an unprecedented quantity of data on the genes that are activated or repressed in a wide range of phenotypes. We have increasingly come to recognize that defining the networks and pathways underlying these phenotypes requires both the integration of multiple data types and the development of advanced computational methods to infer relationships between the genes and to estimate the predictive power of the networks through which they interact. To address these issues we have developed Predictive Networks (PN), a flexible, open-source, web-based application and data services framework that enables the integration, navigation, visualization and analysis of gene interaction networks. The primary goal of PN is to allow biomedical researchers to evaluate experimentally derived gene lists in the context of large-scale gene interaction networks. The PN analytical pipeline involves two key steps. The first is the collection of a comprehensive set of known gene interactions derived from a variety of publicly available sources. The second is to use these ‘known’ interactions together with gene expression data to infer robust gene networks. The PN web application is accessible from http://predictivenetworks.org. The PN code base is freely available at https://sourceforge.net/projects/predictivenets/.


Bioinformatics | 2016

PharmacoGx: an R package for analysis of large pharmacogenomic datasets

Petr Smirnov; Zhaleh Safikhani; Nehme El-Hachem; Dong Wang; Adrian She; Catharina Olsen; Mark Freeman; Heather Selby; Deena M.A. Gendoo; Patrick Grossmann; Andrew H. Beck; Hugo J.W.L. Aerts; Mathieu Lupien; Anna Goldenberg; Benjamin Haibe-Kains

Pharmacogenomics holds great promise for the development of biomarkers of drug response and the design of new therapeutic options, which are key challenges in precision medicine. However, such data are scattered and lack standards for efficient access and analysis, consequently preventing the realization of the full potential of pharmacogenomics. To address these issues, we implemented PharmacoGx, an easy-to-use, open source package for integrative analysis of multiple pharmacogenomic datasets. We demonstrate the utility of our package in comparing large drug sensitivity datasets, such as the Genomics of Drug Sensitivity in Cancer and the Cancer Cell Line Encyclopedia. Moreover, we show how to use our package to easily perform Connectivity Map analysis. With increasing availability of drug-related data, our package will open new avenues of research for meta-analysis of pharmacogenomic data. Availability and implementation: PharmacoGx is implemented in R and can be easily installed on any system. The package is available from CRAN and its source code is available from GitHub. Contact: [email protected] or [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.


Genomics | 2014

Inference and validation of predictive gene networks from biomedical literature and gene expression data

Catharina Olsen; Kathleen Fleming; Niall Prendergast; Renee Rubio; Frank Emmert-Streib; Gianluca Bontempi; Benjamin Haibe-Kains; John Quackenbush

Although many methods have been developed for inference of biological networks, the validation of the resulting models has largely remained an unsolved problem. Here we present a framework for quantitative assessment of inferred gene interaction networks using knock-down data from cell line experiments. Using this framework we are able to show that network inference based on integration of prior knowledge derived from the biomedical literature with genomic data significantly improves the quality of inferred networks relative to other approaches. Our results also suggest that cell line experiments can be used to quantitatively assess the quality of networks inferred from tumor samples.


Science Advances | 2016

Portraying breast cancers with long noncoding RNAs

Olivier Van Grembergen; Martin Bizet; Eric James de Bony; Emilie Calonne; Pascale Putmans; Sylvain Brohée; Catharina Olsen; Mingzhou Guo; Gianluca Bontempi; Christos Sotiriou; Matthieu Defrance; François Fuks

Comprehensive analysis of the lncRNA landscape in breast cancers and its relationship with key clinical features and functional pathways. Evidence is emerging that long noncoding RNAs (lncRNAs) may play a role in cancer development, but this role is not yet clear. We performed a genome-wide transcriptional survey to explore the lncRNA landscape across 995 breast tissue samples. We identified 215 lncRNAs whose genes are aberrantly expressed in breast tumors, as compared to normal samples. Unsupervised hierarchical clustering of breast tumors on the basis of their lncRNAs revealed four breast cancer subgroups that correlate tightly with PAM50-defined mRNA-based subtypes. Using multivariate analysis, we identified no fewer than 210 lncRNAs prognostic of clinical outcome. By analyzing the coexpression of lncRNA genes and protein-coding genes, we inferred potential functions of the 215 dysregulated lncRNAs. We then associated subtype-specific lncRNAs with key molecular processes involved in cancer. A correlation was observed, on the one hand, between luminal A–specific lncRNAs and the activation of phosphatidylinositol 3-kinase, fibroblast growth factor, and transforming growth factor–β pathways and, on the other hand, between basal-like–specific lncRNAs and the activation of epidermal growth factor receptor (EGFR)–dependent pathways and of the epithelial-to-mesenchymal transition. Finally, we showed that a specific lncRNA, which we called CYTOR, plays a role in breast cancer. We confirmed its predicted functions, showing that it regulates genes involved in the EGFR/mammalian target of rapamycin pathway and is required for cell proliferation, cell migration, and cytoskeleton organization. Overall, our work provides the most comprehensive analyses for lncRNA in breast cancers.
Our findings suggest a wide range of biological functions associated with lncRNAs in breast cancer and provide a foundation for functional investigations that could lead to new therapeutic approaches.


BMC Bioinformatics | 2015

NetBenchmark: a Bioconductor package for reproducible benchmarks of gene regulatory network inference

Pau Bellot; Catharina Olsen; Philippe Salembier; Albert Oliveras-Vergés; Patrick E. Meyer

Background: In the last decade, a great number of methods for reconstructing gene regulatory networks from expression data have been proposed. However, very few tools and datasets allow those methods to be evaluated accurately and reproducibly. Hence, we propose here a new tool, able to perform a systematic, yet fully reproducible, evaluation of transcriptional network inference methods. Results: Our open-source and freely available Bioconductor package aggregates a large set of tools to assess the robustness of network inference algorithms against different simulators, topologies, sample sizes and noise intensities. Conclusions: The benchmarking framework that uses various datasets highlights the specialization of some methods toward network types and data. As a result, it is possible to identify the techniques that have broad overall performances.


F1000Research | 2016

TCGA Workflow: Analyze cancer genomics and epigenomics data using Bioconductor packages

Tiago Chedraoui Silva; Antonio Colaprico; Catharina Olsen; Fulvio D'Angelo; Gianluca Bontempi; Michele Ceccarelli; Houtan Noushmehr

Biotechnological advances in sequencing have led to an explosion of publicly available data via large international consortia such as The Cancer Genome Atlas (TCGA), The Encyclopedia of DNA Elements (ENCODE), and The NIH Roadmap Epigenomics Mapping Consortium (Roadmap). These projects have provided unprecedented opportunities to interrogate the epigenome of cultured cancer cell lines as well as normal and tumor tissues with high genomic resolution. The Bioconductor project offers more than 1,000 open-source software and statistical packages to analyze high-throughput genomic data. However, most packages are designed for specific data types (e.g. expression, epigenetics, genomics) and there is no one comprehensive tool that provides a complete integrative analysis of the resources and data provided by all three public projects. A need to create an integration of these different analyses was recently proposed. In this workflow, we provide a series of biologically focused integrative analyses of different molecular data. We describe how to download, process and prepare TCGA data and, by harnessing several key Bioconductor packages, how to extract biologically meaningful genomic and epigenomic data. Using Roadmap and ENCODE data, we provide a work plan to identify biologically relevant functional epigenomic elements associated with cancer. To illustrate our workflow, we analyzed two types of brain tumors: low-grade glioma (LGG) versus high-grade glioma (glioblastoma multiforme or GBM). This workflow introduces the following Bioconductor packages: AnnotationHub, ChIPSeeker, ComplexHeatmap, pathview, ELMER, GAIA, MINET, RTCGAToolbox, TCGAbiolinks.


International Journal of Molecular Sciences | 2017

SpidermiR: An R/Bioconductor Package for Integrative Analysis with miRNA Data

Claudia Cava; Antonio Colaprico; Gloria Bertoli; Alex Graudenzi; Tiago Chedraoui Silva; Catharina Olsen; Houtan Noushmehr; Gianluca Bontempi; Giancarlo Mauri; Isabella Castiglioni

Gene Regulatory Networks (GRNs) control many biological systems, but how such network coordination is shaped is still unknown. GRNs can be subdivided into basic connections that describe how the network members interact, e.g., co-expression, physical interaction, co-localization, genetic influence, pathways, and shared protein domains. The important regulatory mechanisms of these networks involve miRNAs. We developed an R/Bioconductor package, namely SpidermiR, which offers easy access to both GRNs and miRNAs to the end user, and integrates this information with differentially expressed genes obtained from The Cancer Genome Atlas. Specifically, SpidermiR allows the users to: (i) query and download GRNs and miRNAs from validated and predicted repositories; (ii) integrate miRNAs with GRNs in order to obtain miRNA–gene–gene and miRNA–protein–protein interactions, and to analyze miRNA GRNs in order to identify miRNA–gene communities; and (iii) graphically visualize the results of the analyses. These analyses can be performed through a single interface and without the need for any downloads. The full data sets are then rapidly integrated and processed locally.

Collaboration


Dive into Catharina Olsen's collaborations.

Top Co-Authors

Gianluca Bontempi

Université libre de Bruxelles

Benjamin Haibe-Kains

Princess Margaret Cancer Centre

Antonio Colaprico

Université libre de Bruxelles

Claudia Cava

National Research Council

Gloria Bertoli

National Research Council
