Stijn Meganck | Researchain

Publication

Featured researches published by Stijn Meganck.

IEEE/ACM Transactions on Computational Biology and Bioinformatics | 2012

A Survey on Filter Techniques for Feature Selection in Gene Expression Microarray Analysis

Cosmin Lazar; Jonatan Taminau; Stijn Meganck; D. Steenhoff; Alain Coletta; Colin Molter; V. de Schaetzen; Robin Duque; Hugues Bersini; Ann Nowé

A plenitude of feature selection (FS) methods is available in the literature, most of them arising from the need to analyze data of very high dimension, usually hundreds or thousands of variables. Such data sets are now available in various application areas like combinatorial chemistry, text mining, multivariate imaging, or bioinformatics. As a generally accepted rule, these methods are grouped into filters, wrappers, and embedded methods. More recently, a new group of methods has been added to the general framework of FS: ensemble techniques. The focus in this survey is on filter feature selection methods for informative feature discovery in gene expression microarray (GEM) analysis, which is also known as differentially expressed genes (DEGs) discovery, gene prioritization, or biomarker discovery. We present them in a unified framework, using standardized notations in order to reveal their technical details and to highlight their common characteristics as well as their particularities.
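To make the idea of a filter method concrete, the following is a minimal sketch that scores each gene independently of any classifier and ranks genes by that score. The choice of Welch's t-statistic, the function names, and the toy data are all assumptions made for illustration; the survey covers many scoring functions and this is not taken from it.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two independent samples."""
    standard_error = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


def rank_genes(expression, labels):
    """Rank genes by absolute t-statistic, most discriminative first.

    expression: dict mapping gene name -> list of values, one per sample
    labels: list of 0/1 class labels, one per sample
    """
    scores = {}
    for gene, values in expression.items():
        group0 = [v for v, y in zip(values, labels) if y == 0]
        group1 = [v for v, y in zip(values, labels) if y == 1]
        scores[gene] = abs(welch_t(group0, group1))
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: three genes measured on six samples.
labels = [0, 0, 0, 1, 1, 1]
expression = {
    "geneA": [1.0, 1.2, 0.9, 3.1, 2.9, 3.0],
    "geneB": [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],
    "geneC": [5.0, 4.0, 6.0, 5.5, 4.5, 6.5],
}
ranking = rank_genes(expression, labels)
```

Because each gene is scored on its own, the cost grows linearly with the number of genes, which is what makes filters practical for data with thousands of variables.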

Modeling Decisions for Artificial Intelligence | 2006

Learning causal Bayesian networks from observations and experiments: a decision theoretic approach

Stijn Meganck; Philippe Leray; Bernard Manderick

We discuss a decision theoretic approach to learning causal Bayesian networks from observational data and experiments. We use the information in observational data to learn a completed partially directed acyclic graph using a structure learning technique and try to discover the directions of the remaining edges by means of experiments. We show that our approach makes it possible to learn a causal Bayesian network optimally with respect to a number of decision criteria. Our method makes it possible to assign costs to each experiment and each measurement. We introduce an algorithm that actively adds the results of experiments so that arcs can be directed during learning. A numerical example is given to demonstrate the techniques.

BMC Bioinformatics | 2012

Unlocking the potential of publicly available microarray data using inSilicoDb and inSilicoMerging R/Bioconductor packages

Jonatan Taminau; Stijn Meganck; Cosmin Lazar; David Steenhoff; Alain Coletta; Colin Molter; Robin Duque; Virginie de Schaetzen; David Weiss Solís; Hugues Bersini; Ann Nowé

Background: With an abundant amount of microarray gene expression data sets available through public repositories, new possibilities lie in combining multiple existing data sets. In this new context, analysis itself is no longer the problem, but retrieving and consistently integrating all this data before delivering it to the wide variety of existing analysis tools becomes the new bottleneck. Results: We present the newly released inSilicoMerging R/Bioconductor package which, together with the earlier released inSilicoDb R/Bioconductor package, allows consistent retrieval, integration and analysis of publicly available microarray gene expression data sets. Inside the inSilicoMerging package a set of five visual and six quantitative validation measures are available as well. Conclusions: By providing (i) access to uniformly curated and preprocessed data, (ii) a collection of techniques to remove the batch effects between data sets from different sources, and (iii) several validation tools enabling the inspection of the integration process, these packages enable researchers to fully explore the potential of combining gene expression data for downstream analysis. The power of using both packages is demonstrated by programmatically retrieving and integrating gene expression studies from the InSilico DB repository [https://insilicodb.org/app/].

Genome Biology | 2012

InSilico DB genomic datasets hub: an efficient starting point for analyzing genome-wide studies in GenePattern, Integrative Genomics Viewer, and R/Bioconductor

Alain Coletta; Colin Molter; Robin Duque; David Steenhoff; Jonatan Taminau; Virginie de Schaetzen; Stijn Meganck; Cosmin Lazar; David Venet; Vincent Detours; Ann Nowé; Hugues Bersini; David Weiss Solís

Genomics datasets are increasingly useful for gaining biomedical insights, with adoption in the clinic underway. However, multiple hurdles related to data management stand in the way of their efficient large-scale utilization. The solution proposed is a web-based data storage hub. Having clear focus, flexibility and adaptability, InSilico DB seamlessly connects genomics dataset repositories to state-of-the-art and free GUI and command-line data analysis tools. The InSilico DB platform is a powerful collaborative environment, with advanced capabilities for biocuration, dataset sharing, and dataset subsetting and combination. InSilico DB is available from https://insilicodb.org.

Bioinformatics | 2011

inSilicoDb: an R/Bioconductor package for accessing human Affymetrix expert-curated datasets from GEO

Jonatan Taminau; David Steenhoff; Alain Coletta; Stijn Meganck; Cosmin Lazar; Virginie de Schaetzen; Robin Duque; Colin Molter; Hugues Bersini; Ann Nowé; David Weiss Solís

Microarray technology has become an integral part of biomedical research, and an increasing number of datasets is becoming available through public repositories. However, re-use of these datasets is severely hindered by unstructured, missing or incorrect biological sample information, as well as by the wide variety of preprocessing methods in use. The inSilicoDb R/Bioconductor package is a command-line front-end to the InSilico DB, a web-based database currently containing 86 104 expert-curated human Affymetrix expression profiles compiled from 1937 GEO repository series. The use of this package builds on the Bioconductor project's focus on reproducibility by enabling a clear workflow in which not only analysis, but also the retrieval of verified data is supported.

International Scholarly Research Notices | 2014

Comparison of merging and meta-analysis as alternative approaches for integrative gene expression analysis.

Jonatan Taminau; Cosmin Lazar; Stijn Meganck; Ann Nowé

An increasing number of microarray gene expression data sets is available through public repositories. Their huge potential for new findings is yet to be unlocked by making them available for large-scale analysis. To do so, it is essential that independent studies designed for similar biological problems can be integrated, so that new insights can be obtained. These insights would remain undiscovered when analyzing the individual data sets, because it is well known that the small number of biological samples used per experiment is a bottleneck in genomic analysis. By increasing the number of samples, the statistical power is increased and more general and reliable conclusions can be drawn. In this work, two different approaches for conducting large-scale analysis of microarray gene expression data (meta-analysis and data merging) are compared in the context of the identification of cancer-related biomarkers, by analyzing six independent lung cancer studies. Within this study, we investigate the hypothesis that analyzing large cohorts of samples resulting from merging independent data sets designed to study the same biological problem results in lower false discovery rates than analyzing the same data sets within a more conservative meta-analysis approach.
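The difference between the two approaches is that merging pools the samples and runs one test, whereas meta-analysis tests each study separately and then combines the per-study results. The sketch below illustrates only the combination step, using Fisher's method for independent p-values. The abstract does not say which combination rule the authors used, so Fisher's method and the function name are assumptions chosen for illustration.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values for one gene with Fisher's method.

    Under the null hypothesis, X = -2 * sum(ln p_i) follows a chi-square
    distribution with 2k degrees of freedom. For even degrees of freedom
    the survival function has the closed form
    exp(-X/2) * sum_{i<k} (X/2)^i / i!, which is evaluated here.
    """
    k = len(p_values)
    half_x = -sum(math.log(p) for p in p_values)  # this is X / 2
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return math.exp(-half_x) * total


# Hypothetical p-values for the same gene from two independent studies.
combined = fisher_combined_p([0.05, 0.05])
```

With a single study the function returns the input p-value unchanged, which is a convenient sanity check.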

International Journal of Approximate Reasoning | 2012

Conservative independence-based causal structure learning in absence of adjacency faithfulness

Jan Lemeire; Stijn Meganck; Francesco Cartella; Tingting Liu

This paper presents an extension to the Conservative PC algorithm which is able to detect violations of adjacency faithfulness under causal sufficiency and triangle faithfulness. Violations can be characterized by pseudo-independent relations and equivalent edges, both generating a pattern of conditional independencies that cannot be modeled faithfully. Both cases lead to uncertainty about specific parts of the skeleton of the causal graph. These ambiguities are modeled by an f-pattern. We prove that our Adjacency Conservative PC algorithm is able to correctly learn the f-pattern. We argue that the solution also applies to the finite sample case if we accept that only strong edges can be identified. Experiments based on simulations and the ALARM benchmark model show that the rate of false edge removals is significantly reduced, at the expense of uncertainty on the skeleton and a higher sensitivity to accidental correlations.

Journal of Physics: Conference Series | 2012

Online adaptive learning of Left-Right Continuous HMM for bearings condition assessment

Francesco Cartella; Tingting Liu; Stijn Meganck; Jan Lemeire; Hichem Sahli

Standard Hidden Markov Model (HMM) approaches used for condition assessment of bearings assume that all the possible system states are fixed and known a priori and that training data from all of the associated states are available. Moreover, the training procedure is performed offline, and only once at the beginning, with the available training set. These assumptions significantly impede component diagnosis applications when not all of the possible states of the system are known in advance, or when environmental factors or operating conditions change during the tool's usage. The method introduced in this paper overcomes the above limitations and proposes an approach to detect unknown degradation modalities using a Left-Right Continuous HMM with a variable state space. The proposed HMM is combined with Change Point Detection algorithms to (i) estimate, from historical observations, the initial number of the model's states, as well as to perform an initial guess of the parameters, and (ii) adaptively recognize new states and, consequently, adjust the model parameters during monitoring. The approach has been tested using real monitoring data taken from the NASA benchmark repository. A comparative study with state-of-the-art techniques shows improvements in terms of a reduction in training procedure iterations and early detection of unknown states.
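As a rough illustration of what a left-right model with a variable state space looks like, the sketch below builds a left-right transition matrix and then grows it by one state. It covers only the transition structure. The function names and the fixed `stay_prob` value are hypothetical, and the abstract does not describe how the authors re-estimate parameters when a new state is recognized.

```python
def left_right_transitions(n_states, stay_prob=0.9):
    """Transition matrix of a left-right HMM.

    Each state either persists or advances to the next state, and the
    last state is absorbing. stay_prob is a placeholder value.
    """
    matrix = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states - 1):
        matrix[i][i] = stay_prob
        matrix[i][i + 1] = 1.0 - stay_prob
    matrix[-1][-1] = 1.0
    return matrix


def add_state(matrix, stay_prob=0.9):
    """Return a copy of the matrix with one extra final state.

    The previous final state stops being absorbing and can now advance
    to the newly added state, which becomes the absorbing one.
    """
    n = len(matrix)
    grown = [row + [0.0] for row in matrix]
    grown[n - 1][n - 1] = stay_prob
    grown[n - 1][n] = 1.0 - stay_prob
    grown.append([0.0] * n + [1.0])
    return grown


# Start with two known degradation states, then add a third during monitoring.
initial = left_right_transitions(2)
extended = add_state(initial)
```

Growing the matrix this way preserves the left-right constraint: no transition ever leads back to an earlier state.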

European Conference on Symbolic and Quantitative Approaches to Reasoning and Uncertainty | 2007

Causal Graphical Models with Latent Variables: Learning and Inference

Stijn Meganck; Philippe Leray; Bernard Manderick

Several paradigms exist for modeling causal graphical models for discrete variables that can handle latent variables without explicitly modeling them quantitatively. Applying them to a problem domain consists of different steps: structure learning, parameter learning and using them for probabilistic or causal inference. We discuss two well-known formalisms, namely semi-Markovian causal models and maximal ancestral graphs, and indicate their strengths and limitations. Previously, an algorithm was constructed that, by combining elements from both techniques, makes it possible to learn a semi-Markovian causal model from a mixture of observational and experimental data. The goal of this paper is to recapitulate the integral learning process from observational and experimental data and to demonstrate how different types of inference can be performed efficiently in the learned models. We do this by proposing an alternative representation for semi-Markovian causal models.

IEEE/ACM Transactions on Computational Biology and Bioinformatics | 2013

GENESHIFT: A Nonparametric Approach for Integrating Microarray Gene Expression Data Based on the Inner Product as a Distance Measure between the Distributions of Genes

Cosmin Lazar; Jonatan Taminau; Stijn Meganck; David Steenhoff; Alain Coletta; David Weiss Solís; Colin Molter; Robin Duque; Hugues Bersini; Ann Nowé

The potential of microarray gene expression (MAGE) data is only partially explored due to the limited number of samples in individual studies. This limitation can be surmounted by merging or integrating data sets originating from independent MAGE experiments, which are designed to study the same biological problem. However, this process is hindered by batch effects that are study-dependent and result in random data distortion; therefore numerical transformations are needed to render the integration of different data sets accurate and meaningful. Our contribution in this paper is two-fold. First, we propose GENESHIFT, a new nonparametric batch effect removal method based on two key elements from statistics: empirical density estimation and the inner product as a distance measure between two probability density functions. Second, we introduce a new validation index for batch effect removal methods, based on the observation that samples from two independent studies drawn from the same population should exhibit similar probability density functions. We evaluated and compared the GENESHIFT method with four other state-of-the-art methods for batch effect removal: batch mean-centering, empirical Bayes or COMBAT, distance-weighted discrimination, and cross-platform normalization. Several validation indices providing complementary information about the efficiency of batch effect removal methods have been employed in our validation framework. The results show that none of the methods clearly outperforms the others. Moreover, most of the methods used for comparison perform very well with respect to some validation indices while performing very poorly with respect to others. GENESHIFT exhibits robust performance, and its average rank is the highest among the average ranks of all methods used for comparison.
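Of the methods named above, batch mean-centering is the simplest to write down: for each gene, the mean within each study is subtracted from that study's values. The sketch below shows this baseline for a single gene. It is not GENESHIFT, and the function name and toy values are assumptions for illustration.

```python
from statistics import mean


def batch_mean_center(values, batches):
    """Remove per-batch location shifts for one gene.

    values: expression values of one gene, one per sample
    batches: batch (study) identifier of each sample, same length
    """
    batch_means = {
        b: mean([v for v, vb in zip(values, batches) if vb == b])
        for b in set(batches)
    }
    return [v - batch_means[b] for v, b in zip(values, batches)]


# Hypothetical gene measured in two studies with a constant offset of 10.
values = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]
batches = ["study1"] * 3 + ["study2"] * 3
centered = batch_mean_center(values, batches)
```

Mean-centering only corrects a shift in location; differences in spread or in the shape of the distribution between studies are left untouched, which is the kind of gap density-based methods aim to address.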
