Fabian Ojeda | Researchain

Publication

Featured research published by Fabian Ojeda.

Analytica Chimica Acta | 2010

A tutorial on support vector machine-based methods for classification problems in chemometrics

Jan Luts; Fabian Ojeda; Raf Van de Plas; Bart De Moor; Sabine Van Huffel; Johan A. K. Suykens

This tutorial provides a concise overview of support vector machines and different closely related techniques for pattern classification. The tutorial starts with the formulation of support vector machines for classification. The method of least squares support vector machines is explained. Approaches to retrieve a probabilistic interpretation are covered and it is explained how the binary classification techniques can be extended to multi-class methods. Kernel logistic regression, which is closely related to iteratively weighted least squares support vector machines, is discussed. Different practical aspects of these methods are addressed: the issue of feature selection, parameter tuning, unbalanced data sets, model evaluation and statistical comparison. The different concepts are illustrated on three real-life applications in the field of metabolomics, genetics and proteomics.

BMC Bioinformatics | 2010

Candidate gene prioritization by network analysis of differential expression using machine learning approaches

Daniela Nitsch; Joana P. Gonçalves; Fabian Ojeda; Bart De Moor; Yves Moreau

Background: Discovering novel disease genes is still challenging for diseases for which no prior knowledge - such as known disease genes or disease-related pathways - is available. Performing genetic studies frequently results in large lists of candidate genes of which only few can be followed up for further investigation. We have recently developed a computational method for constitutional genetic disorders that identifies the most promising candidate genes by replacing prior knowledge by experimental data of differential gene expression between affected and healthy individuals. To improve the performance of our prioritization strategy, we have extended our previous work by applying different machine learning approaches that identify promising candidate genes by determining whether a gene is surrounded by highly differentially expressed genes in a functional association or protein-protein interaction network.

Results: We have proposed three strategies scoring disease candidate genes relying on network-based machine learning approaches, such as kernel ridge regression, heat kernel, and Arnoldi kernel approximation. For comparison purposes, a local measure based on the expression of the direct neighbors is also computed. We have benchmarked these strategies on 40 publicly available knockout experiments in mice, and performance was assessed against results obtained using a standard procedure in genetics that ranks candidate genes based solely on their differential expression levels (Simple Expression Ranking). Our results showed that our four strategies could outperform this standard procedure and that the best results were obtained using the Heat Kernel Diffusion Ranking leading to an average ranking position of 8 out of 100 genes, an AUC value of 92.3% and an error reduction of 52.8% relative to the standard procedure approach which ranked the knockout gene on average at position 17 with an AUC value of 83.7%.

Conclusion: In this study we could identify promising candidate genes using network based machine learning approaches even if no knowledge is available about the disease or phenotype.
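The idea behind heat kernel diffusion ranking can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy network, the expression values, the function name `heat_diffusion_scores`, and the parameters `beta` and `steps` are all assumptions, and the matrix exponential is approximated with small explicit Euler steps rather than computed exactly.

```python
def heat_diffusion_scores(adjacency, expression, beta=0.5, steps=200):
    """Approximate exp(-beta * L) @ expression, where L = D - A is the graph Laplacian.

    The matrix exponential is approximated by `steps` explicit Euler steps
    of the heat equation ds/dt = -L s (an illustrative shortcut).
    """
    n = len(adjacency)
    degree = [sum(row) for row in adjacency]
    scores = [float(v) for v in expression]
    dt = beta / steps
    for _ in range(steps):
        # (L s)_i = degree_i * s_i - sum_j A_ij * s_j
        laplacian_s = [
            degree[i] * scores[i] - sum(adjacency[i][j] * scores[j] for j in range(n))
            for i in range(n)
        ]
        scores = [s - dt * ls for s, ls in zip(scores, laplacian_s)]
    return scores


# Hypothetical toy network: genes 0 and 3 are candidates with no expression change.
# Gene 0 neighbours two strongly differentially expressed genes (1 and 2);
# gene 3 neighbours a weakly changed gene (4).
adjacency = [
    [0, 1, 1, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
]
expression = [0.0, 3.0, 2.5, 0.0, 0.1]  # absolute differential expression (made up)

scores = heat_diffusion_scores(adjacency, expression)
candidates = [0, 3]
ranking = sorted(candidates, key=lambda g: scores[g], reverse=True)
print(ranking)
```

Ranking by raw expression alone would tie the two candidates at zero. After diffusion, gene 0 ranks ahead of gene 3 because it sits next to strongly differentially expressed neighbours.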

Genome Medicine | 2009

A kernel-based integration of genome-wide data for clinical decision support.

Anneleen Daemen; Olivier Gevaert; Fabian Ojeda; Annelies Debucquoy; Johan A. K. Suykens; Christine Sempoux; Jean-Pascal Machiels; Karin Haustermans; Bart De Moor

Background: Although microarray technology allows the investigation of the transcriptomic make-up of a tumor in one experiment, the transcriptome does not completely reflect the underlying biology due to alternative splicing, post-translational modifications, as well as the influence of pathological conditions (for example, cancer) on transcription and translation. This increases the importance of fusing more than one source of genome-wide data, such as the genome, transcriptome, proteome, and epigenome. The current increase in the amount of available omics data emphasizes the need for a methodological integration framework.

Methods: We propose a kernel-based approach for clinical decision support in which many genome-wide data sources are combined. Integration occurs within the patient domain at the level of kernel matrices before building the classifier. As supervised classification algorithm, a weighted least squares support vector machine is used. We apply this framework to two cancer cases, namely, a rectal cancer data set containing microarray and proteomics data and a prostate cancer data set containing microarray and genomics data. For both cases, multiple outcomes are predicted.

Results: For the rectal cancer outcomes, the highest leave-one-out (LOO) areas under the receiver operating characteristic curves (AUC) were obtained when combining microarray and proteomics data gathered during therapy and ranged from 0.927 to 0.987. For prostate cancer, all four outcomes had a better LOO AUC when combining microarray and genomics data, ranging from 0.786 for recurrence to 0.987 for metastasis.

Conclusions: For both cancer sites the prediction of all outcomes improved when more than one genome-wide data set was considered. This suggests that integrating multiple genome-wide data sources increases the predictive performance of clinical decision support models. This emphasizes the need for comprehensive multi-modal data. We acknowledge that, in a first phase, this will substantially increase costs; however, this is a necessary investment to ultimately obtain cost-efficient models usable in patient tailored therapy.
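Integration at the level of kernel matrices can be illustrated with a short sketch. It assumes linear kernels, cosine-style normalization, and equal weights; none of these choices is stated in the abstract, and the data and function names are made up for illustration.

```python
import math


def linear_kernel(samples):
    """Gram matrix of inner products between patients (rows of `samples`)."""
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in samples] for xi in samples]


def normalize_kernel(kernel):
    """Scale a kernel so every patient has unit self-similarity."""
    n = len(kernel)
    diag = [math.sqrt(kernel[i][i]) for i in range(n)]
    return [[kernel[i][j] / (diag[i] * diag[j]) for j in range(n)] for i in range(n)]


def integrate_kernels(kernels, weights=None):
    """Weighted sum of kernel matrices defined over the same patients."""
    if weights is None:
        weights = [1.0 / len(kernels)] * len(kernels)
    n = len(kernels[0])
    return [
        [sum(w * k[i][j] for w, k in zip(weights, kernels)) for j in range(n)]
        for i in range(n)
    ]


# Made-up data: three patients measured on two platforms.
microarray = [[2.0, 0.5, 1.0, 0.0], [1.8, 0.4, 1.2, 0.1], [0.1, 2.0, 0.0, 1.5]]
proteomics = [[0.9, 0.1], [1.0, 0.2], [0.2, 1.1]]

combined = integrate_kernels(
    [
        normalize_kernel(linear_kernel(microarray)),
        normalize_kernel(linear_kernel(proteomics)),
    ]
)
```

Each data source may have a different number of features, but every kernel matrix is patients by patients, so the sources can be added directly. A non-negative weighted sum of valid kernels is again a valid kernel, so `combined` can be passed to any kernel classifier.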

Pacific Symposium on Biocomputing | 2006

Prospective exploration of biochemical tissue composition via imaging mass spectrometry guided by principal component analysis.

Raf Van de Plas; Fabian Ojeda; Maarten Dewil; Ludo Van Den Bosch; Bart De Moor; Etienne Waelkens

MALDI-based Imaging Mass Spectrometry (IMS) is an analytical technique that provides the opportunity to study the spatial distribution of biomolecules including proteins and peptides in organic tissue. IMS measures a large collection of mass spectra spread out over an organic tissue section and retains the absolute spatial location of these measurements for analysis and imaging. The classical approach to IMS imaging, producing univariate ion images, is not well suited as a first step in a prospective study where no a priori molecular target mass can be formulated. The main reasons for this are the size and the multivariate nature of IMS data. In this paper we describe the use of principal component analysis as a multivariate pre-analysis tool, to identify the major spatial and mass-related trends in the data and to guide further analysis downstream. First, a conceptual overview of principal component analysis for IMS is given. Then, we demonstrate the approach on an IMS data set collected from a transversal section of the spinal cord of a standard control rat.
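As a rough sketch of the approach, the IMS data cube can be unfolded into a pixels-by-m/z matrix, decomposed, and the component scores folded back into an image. The data, the function name, and the use of power iteration for the first component are illustrative assumptions, not details from the paper.

```python
import math


def first_principal_component(spectra, iterations=100):
    """Return (loading, scores) of the first principal component via power iteration."""
    n_bins = len(spectra[0])
    means = [sum(s[k] for s in spectra) / len(spectra) for k in range(n_bins)]
    centered = [[s[k] - means[k] for k in range(n_bins)] for s in spectra]

    def project(loading):
        return [sum(c * l for c, l in zip(row, loading)) for row in centered]

    loading = [1.0 / (k + 1) for k in range(n_bins)]  # arbitrary non-zero start
    for _ in range(iterations):
        scores = project(loading)
        # Multiply by X^T X without forming the covariance matrix.
        loading = [
            sum(scores[i] * centered[i][k] for i in range(len(centered)))
            for k in range(n_bins)
        ]
        norm = math.sqrt(sum(l * l for l in loading))
        loading = [l / norm for l in loading]
    return loading, project(loading)


# Made-up 2 x 2 pixel image with 4 m/z bins per pixel, stored row by row.
height, width = 2, 2
spectra = [
    [1.0, 5.0, 1.0, 0.0],  # pixel (0, 0): peak in bin 1
    [1.0, 0.0, 1.0, 5.0],  # pixel (0, 1): peak in bin 3
    [1.0, 4.0, 1.0, 0.0],  # pixel (1, 0): peak in bin 1
    [1.0, 0.0, 1.0, 4.0],  # pixel (1, 1): peak in bin 3
]

loading, scores = first_principal_component(spectra)
# Fold the per-pixel scores back into the spatial layout.
score_image = [scores[r * width:(r + 1) * width] for r in range(height)]
```

The loading vector shows which m/z bins drive the component (here bins 1 and 3, with opposite signs), while `score_image` shows where in the tissue that trend occurs (left column versus right column).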

Neural Networks | 2008

2008 Special Issue: Low rank updated LS-SVM classifiers for fast variable selection

Fabian Ojeda; Johan A. K. Suykens; Bart De Moor

Least squares support vector machine (LS-SVM) classifiers are a class of kernel methods whose solution follows from a set of linear equations. In this work we present low rank modifications to the LS-SVM classifiers that are useful for fast and efficient variable selection. The inclusion or removal of a candidate variable can be represented as a low rank modification to the kernel matrix (linear kernel) of the LS-SVM classifier. In this way, the LS-SVM solution can be updated rather than being recomputed, which improves the efficiency of the overall variable selection process. Relevant variables are selected according to a closed form of the leave-one-out (LOO) error estimator, which is obtained as a by-product of the low rank modifications. The proposed approach is applied to several benchmark data sets as well as two microarray data sets. When compared to other related algorithms used for variable selection, simulations applying our approach clearly show a lower computational complexity together with good stability on the generalization error.
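The update-instead-of-recompute idea can be sketched as follows. With a linear kernel, adding a variable with values x across samples changes the kernel matrix to K + x xᵀ, so the inverse of K + I/γ can be refreshed with the Sherman-Morrison formula. The sketch treats the ±1 labels as regression targets, a standard equivalent form of the LS-SVM classifier system. The data, `gamma`, and the function names are assumptions for illustration, and the paper's actual algorithm may be organized differently.

```python
def add_variable(inverse, column):
    """Sherman-Morrison update of (K + I/gamma)^-1 after K <- K + x x^T."""
    n = len(column)
    px = [sum(inverse[i][j] * column[j] for j in range(n)) for i in range(n)]
    denom = 1.0 + sum(column[i] * px[i] for i in range(n))
    return [
        [inverse[i][j] - px[i] * px[j] / denom for j in range(n)]
        for i in range(n)
    ]


def lssvm_solution(inverse, labels):
    """Solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y] given P = (K + I/gamma)^-1."""
    n = len(labels)
    p_one = [sum(row) for row in inverse]  # P @ 1
    p_y = [sum(inverse[i][j] * labels[j] for j in range(n)) for i in range(n)]  # P @ y
    bias = sum(p_y) / sum(p_one)
    alpha = [p_y[i] - bias * p_one[i] for i in range(n)]
    return alpha, bias


# Made-up data: four samples, two candidate variables, labels in {-1, +1}.
gamma = 10.0
labels = [1.0, 1.0, -1.0, -1.0]
variables = [
    [1.0, 0.8, -1.0, -0.9],
    [0.2, -0.1, 0.1, -0.3],
]
n = len(labels)

# With no variables selected K = 0, so (K + I/gamma)^-1 = gamma * I.
inverse = [[gamma if i == j else 0.0 for j in range(n)] for i in range(n)]
for column in variables:
    inverse = add_variable(inverse, column)  # O(n^2) per added variable

alpha, bias = lssvm_solution(inverse, labels)

# Check against the kernel built from scratch.
kernel = [[sum(v[i] * v[j] for v in variables) for j in range(n)] for i in range(n)]
predictions = [sum(alpha[j] * kernel[i][j] for j in range(n)) + bias for i in range(n)]
```

Each update costs on the order of n² operations instead of the n³ needed to refactor the system, which is what makes forward or backward selection over many candidate variables affordable.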

Human Reproduction | 2012

Combined mRNA microarray and proteomic analysis of eutopic endometrium of women with and without endometriosis

Amelie Fassbender; N. Verbeeck; D. Börnigen; Cleophas Kyama; Attila Bokor; Alexandra Vodolazkaia; Karen Peeraer; Carla Tomassetti; Christel Meuleman; Olivier Gevaert; R Van de Plas; Fabian Ojeda; B. De Moor; Yves Moreau; Etienne Waelkens; Thomas D'Hooghe

BACKGROUND An early semi-invasive diagnosis of endometriosis has the potential to allow early treatment and minimize disease progression but no such test is available at present. Our aim was to perform a combined mRNA microarray and proteomic analysis on the same eutopic endometrium sample obtained from patients with and without endometriosis. METHODS mRNA and protein fractions were extracted from 49 endometrial biopsies obtained from women with laparoscopically proven presence (n= 31) or absence (n= 18) of endometriosis during the early luteal (n= 27) or menstrual phase (n= 22) and analyzed using microarray and proteomic surface enhanced laser desorption ionization-time of flight mass spectrometry, respectively. Proteomic data were analyzed using a least squares-support vector machines (LS-SVM) model built on 70% (training set) and 30% of the samples (test set). RESULTS mRNA analysis of eutopic endometrium did not show any differentially expressed genes in women with endometriosis when compared with controls, regardless of endometriosis stage or cycle phase. mRNA was differentially expressed (P< 0.05) in women with (925 genes) and without endometriosis (1087 genes) during the menstrual phase when compared with the early luteal phase. Proteomic analysis based on five peptide peaks [2072 mass/charge (m/z); 2973 m/z; 3623 m/z; 3680 m/z and 21133 m/z] using an LS-SVM model applied on the luteal phase endometrium training set allowed the diagnosis of endometriosis (sensitivity, 91; 95% confidence interval (CI): 74-98; specificity, 80; 95% CI: 66-97 and positive predictive value, 87.9%; negative predictive value, 84.8%) in the test set. CONCLUSION mRNA expression of eutopic endometrium was comparable in women with and without endometriosis but different in menstrual endometrium when compared with luteal endometrium in women with endometriosis. 
Proteomic analysis of luteal phase endometrium allowed the diagnosis of endometriosis with high sensitivity and specificity in training and test sets. A potential limitation of our study is the fact that our control group included women with a normal pelvis as well as women with concurrent pelvic disease (e.g. fibroids, benign ovarian cysts, hydrosalpinges), which may have contributed to the comparable mRNA expression profile in the eutopic endometrium of women with endometriosis and controls.

Obstetrics & Gynecology | 2012

Proteomics analysis of plasma for early diagnosis of endometriosis

Amelie Fassbender; Etienne Waelkens; Nico Verbeeck; Cleophas Kyama; Attila Bokor; Alexandra Vodolazkaia; Raf Van de Plas; Christel Meuleman; Karen Peeraer; Carla Tomassetti; Olivier Gevaert; Fabian Ojeda; Bart De Moor; Thomas D'Hooghe

OBJECTIVE: To test the hypothesis that differential surface-enhanced laser desorption/ionization time-of-flight mass spectrometry protein or peptide expression in plasma can be used in infertile women with or without pelvic pain to predict the presence of laparoscopically and histologically confirmed endometriosis, especially in the subpopulation with a normal preoperative gynecologic ultrasound examination. METHODS: Surface-enhanced laser desorption/ionization time-of-flight mass spectrometry analysis was performed on 254 plasma samples obtained from 89 women without endometriosis and 165 women with endometriosis (histologically confirmed) undergoing laparoscopies for infertility with or without pelvic pain. Data were analyzed using least squares support vector machines and were divided randomly (100 times) into a training data set (70%) and a test data set (30%). RESULTS: Minimal-to-mild endometriosis was best predicted (sensitivity 75%, 95% confidence interval [CI] 63–89; specificity 86%, 95% CI 71–94; positive predictive value 83.6%, negative predictive value 78.3%) using a model based on five peptide and protein peaks (range 4.898–14.698 m/z) in menstrual phase samples. Moderate-to-severe endometriosis was best predicted (sensitivity 98%, 95% CI 84–100; specificity 81%, 95% CI 67–92; positive predictive value 74.4%, negative predictive value 98.6%) using a model based on five other peptide and protein peaks (range 2.189–7.457 m/z) in luteal phase samples. The peak with the highest intensity (2.189 m/z) was identified as a fibrinogen β-chain peptide. Ultrasonography-negative endometriosis was best predicted (sensitivity 88%, 95% CI 73–100; specificity 84%, 95% CI 71–96) using a model based on five peptide peaks (range 2.058–42.065 m/z) in menstrual phase samples. 
CONCLUSION: A noninvasive test using proteomic analysis of plasma samples obtained during the menstrual phase enabled the diagnosis of endometriosis undetectable by ultrasonography with high sensitivity and specificity. LEVEL OF EVIDENCE: II

Pattern Recognition in Bioinformatics | 2010

Semi-supervised learning of sparse linear models in mass spectral imaging

Fabian Ojeda; Marco Signoretto; Raf Van de Plas; Etienne Waelkens; Bart De Moor; Johan A. K. Suykens

We present an approach to learn predictive models and perform variable selection by incorporating structural information from Mass Spectral Imaging (MSI) data. We explore the use of a smooth quadratic penalty to model the natural ordering of the physical variables, that is, the mass-to-charge (m/z) ratios. The estimated model parameters for nearby variables are thereby encouraged to vary smoothly. Similarly, to overcome the lack of labeled data, we model the spatial proximity among spectra by means of a connectivity graph over the set of predicted labels. We explore the usefulness of this approach on a mouse brain MSI data set.
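The two structural penalties described here can be written down in a few lines. This is only a sketch of the general form of such terms: the exact penalties and their weighting in the paper are not given in the abstract, and the function names and values are made up.

```python
def smoothness_penalty(weights):
    """Sum of squared differences between coefficients of neighbouring m/z bins."""
    return sum((b - a) ** 2 for a, b in zip(weights, weights[1:]))


def graph_penalty(edges, predictions):
    """Sum of squared differences between predictions of connected spectra."""
    return sum((predictions[i] - predictions[j]) ** 2 for i, j in edges)


# Coefficients over four consecutive m/z bins (made up).
smooth_weights = [0.0, 0.1, 0.2, 0.3]
jagged_weights = [0.0, 0.3, 0.0, 0.3]

# Connectivity graph of a 2 x 2 pixel grid, pixels numbered row by row.
edges = [(0, 1), (2, 3), (0, 2), (1, 3)]
coherent_labels = [1.0, 1.0, -1.0, -1.0]      # top row vs bottom row
checkerboard_labels = [1.0, -1.0, -1.0, 1.0]  # every neighbour disagrees

print(smoothness_penalty(smooth_weights) < smoothness_penalty(jagged_weights))
print(graph_penalty(edges, coherent_labels) < graph_penalty(edges, checkerboard_labels))
```

Adding such terms to a loss function favours coefficient profiles that change gradually along the m/z axis and label maps that are spatially coherent, which is how unlabeled neighbouring spectra can contribute information.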

International Symposium on Neural Networks | 2007

Variable selection by rank-one updates for least squares support vector machines

Fabian Ojeda; Johan A. K. Suykens; B. De Moor

Least squares support vector machines (LS-SVM) classifiers are a class of simple, yet powerful, kernel methods whose solution follows from a set of linear equations. Here, forward and backward algorithms, based on this technique, are proposed for fast and efficient variable selection. By exploiting the structure of the LS-SVM solution a closed form expression for the leave-one-out (LOO) estimator, useful for selecting variables, is obtained. For inclusion or removal of a new variable, rank-one adjustments in the kernel matrix (linear kernel) allow for updating, rather than recomputing, the LS-SVM solution. The proposed approach is applied to microarray data for gene selection. Simulations clearly show lower computational complexity along with good stability on the generalization performance when compared to other related algorithms.

International Symposium on Neural Networks | 2010

Polynomial componentwise LS-SVM: Fast variable selection using low rank updates

Fabian Ojeda; Tillmann Falck; Bart De Moor; Johan A. K. Suykens

This paper describes a Least Squares Support Vector Machines (LS-SVM) approach to estimate additive models as a sum of non-linear components. In particular, this work discusses the low rank matrix modifications for componentwise polynomial kernels, which allow the factors of the modified kernel matrix to be directly updated. The main concept refers to the use of a valid explicit feature map for polynomial kernels in an additive setting. By exploiting the structure of such a feature map, the model parameters of the classification/regression problem can be easily modified and updated when new variables are added. Therefore, the low rank updates constitute an algorithmic tool to efficiently obtain the model parameters once the system has been altered in some minimal sense. Such a strategy allows, for instance, the development of algorithms for sequential variable ranking in high dimensional settings, while non-linearity is provided by the polynomial feature map. Moreover, relevant variables can be robustly ranked using the closed form of the leave-one-out (LOO) error estimator, obtained as a by-product of the low rank modifications.
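An explicit feature map for an additive polynomial kernel can be sketched as below. The sketch assumes the per-variable kernel (1 + x z)^d, whose binomial expansion gives the feature map directly; the abstract does not specify the exact kernel form, so this choice and the function names are assumptions.

```python
import math


def component_features(value, degree):
    """Explicit feature map of (1 + x z)^degree for a single variable."""
    return [math.sqrt(math.comb(degree, k)) * value ** k for k in range(degree + 1)]


def additive_features(sample, degree):
    """Concatenate the per-variable feature blocks of one sample."""
    features = []
    for value in sample:
        features.extend(component_features(value, degree))
    return features


def componentwise_kernel(x, z, degree):
    """Additive polynomial kernel: sum over variables of (1 + x_j z_j)^degree."""
    return sum((1.0 + a * b) ** degree for a, b in zip(x, z))


degree = 3
x = [0.5, -1.0]
z = [2.0, 0.3]

explicit = sum(a * b for a, b in zip(additive_features(x, degree), additive_features(z, degree)))
implicit = componentwise_kernel(x, z, degree)
print(abs(explicit - implicit) < 1e-9)
```

Under this map, adding one input variable appends a block of `degree + 1` feature columns, so the kernel matrix changes by a term of rank at most `degree + 1`. That is what makes low rank updates applicable in the non-linear additive setting.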
