Pierre Geurts | Researchain

Publication

Featured research published by Pierre Geurts.

Machine Learning | 2006

Extremely randomized trees

Pierre Geurts; Damien Ernst; Louis Wehenkel

This paper proposes a new tree-based ensemble method for supervised classification and regression problems. It essentially consists of randomizing strongly both attribute and cut-point choice while splitting a tree node. In the extreme case, it builds totally randomized trees whose structures are independent of the output values of the learning sample. The strength of the randomization can be tuned to problem specifics by the appropriate choice of a parameter. We evaluate the robustness of the default choice of this parameter, and we also provide insight on how to adjust it in particular situations. Besides accuracy, the main strength of the resulting algorithm is computational efficiency. A bias/variance analysis of the Extra-Trees algorithm is also provided as well as a geometrical and a kernel characterization of the models induced.
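
The node-splitting idea described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the parameter name `k` (the number of random candidate splits), the helper names, and the variance-reduction score for regression are all assumptions. With `k = 1` the chosen split never depends on the outputs, which corresponds to the totally randomized case.

```python
import random


def variance(values):
    """Population variance; 0.0 for an empty list."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def pick_random_split(X, y, k, rng):
    """Choose a (feature, threshold) split among k randomly drawn candidates.

    k is a hypothetical name for the randomization-strength parameter.
    """
    # Features that are constant in this node cannot separate the samples.
    usable = [j for j in range(len(X[0]))
              if min(row[j] for row in X) < max(row[j] for row in X)]
    if not usable:
        return None
    best, best_score = None, float("-inf")
    for j in rng.sample(usable, min(k, len(usable))):
        lo = min(row[j] for row in X)
        hi = max(row[j] for row in X)
        threshold = rng.uniform(lo, hi)  # cut-point drawn at random, not optimized
        left = [t for row, t in zip(X, y) if row[j] < threshold]
        right = [t for row, t in zip(X, y) if row[j] >= threshold]
        # Variance reduction is an assumed score for the regression case.
        score = variance(y) - (len(left) * variance(left)
                               + len(right) * variance(right)) / len(y)
        if score > best_score:
            best, best_score = (j, threshold), score
    return best
```

Larger values of `k` let the score filter out poor candidates, so the split depends more on the outputs; smaller values increase randomization.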

Computer Vision and Pattern Recognition | 2005

Random subwindows for robust image classification

Pierre Geurts; Justus H. Piater; Louis Wehenkel

We present a novel, generic image classification method based on a recent machine learning algorithm (ensembles of extremely randomized decision trees). Images are classified using randomly extracted subwindows that are suitably normalized to yield robustness to certain image transformations. Our method is evaluated on four very different, publicly available datasets (COIL-100, ZuBuD, ETH-80, WANG). Our results show that our automatic approach is generic and robust to illumination, scale, and viewpoint changes. An extension of the method is proposed to improve its robustness with respect to rotation changes.
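
A rough sketch of the subwindow idea, under stated assumptions: images are plain nested lists of pixel values, "normalization" is reduced to nearest-neighbour resizing of each square subwindow to a fixed size, and the image label is a majority vote over per-subwindow predictions. The function names and the `predict_patch` callback are hypothetical; the abstract does not specify these details.

```python
import random
from collections import Counter


def random_subwindow(image, out_size, rng):
    """Cut a square window of random size and position, resized to out_size."""
    height, width = len(image), len(image[0])
    side = rng.randint(1, min(height, width))
    top = rng.randint(0, height - side)
    left = rng.randint(0, width - side)
    # Nearest-neighbour resampling maps every window to the same fixed size.
    return [[image[top + (r * side) // out_size][left + (c * side) // out_size]
             for c in range(out_size)]
            for r in range(out_size)]


def classify_by_vote(image, predict_patch, n_windows, out_size, rng):
    """Label an image by majority vote over its random subwindows (assumed rule)."""
    votes = Counter(predict_patch(random_subwindow(image, out_size, rng))
                    for _ in range(n_windows))
    return votes.most_common(1)[0][0]
```

In practice `predict_patch` would be a trained classifier applied to the flattened subwindow; here it is left as a callback so the sketch stays self-contained.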

European Conference on Principles of Data Mining and Knowledge Discovery | 2001

Pattern Extraction for Time Series Classification

Pierre Geurts

In this paper, we propose some new tools to allow machine learning classifiers to cope with time series data. We first argue that many time-series classification problems can be solved by detecting and combining local properties or patterns in time series. Then, a technique is proposed to find patterns which are useful for classification. These patterns are combined to build interpretable classification rules. Experiments, carried out on several artificial and real problems, highlight the interest of the approach both in terms of interpretability and accuracy of the induced classifiers.

Bioinformatics | 2005

Proteomic mass spectra classification using decision tree based ensemble methods

Pierre Geurts; Marianne Fillet; Dominique de Seny; Marie-Alice Meuwis; Michel Malaise; Marie-Paule Merville; Louis Wehenkel

MOTIVATION: Modern mass spectrometry allows the determination of proteomic fingerprints of body fluids like serum, saliva or urine. These measurements can be used in many medical applications in order to diagnose the current state or predict the evolution of a disease. Recent developments in machine learning allow one to exploit such datasets, characterized by small numbers of very high-dimensional samples. RESULTS: We propose a systematic approach based on decision tree ensemble methods, which is used to automatically determine proteomic biomarkers and predictive models. The approach is validated on two datasets of surface-enhanced laser desorption/ionization time of flight measurements, for the diagnosis of rheumatoid arthritis and inflammatory bowel diseases. The results suggest that the methodology can handle a broad class of similar problems.

PLOS ONE | 2011

MicroRNAs Profiling in Murine Models of Acute and Chronic Asthma: A Relationship with mRNAs Targets

Nancy Garbacki; Emmanuel Di Valentin; Vân Anh Huynh-Thu; Pierre Geurts; Alexandre Irrthum; Céline Crahay; Thierry Arnould; Christophe Deroanne; Jacques Piette; Didier Cataldo; Alain Colige

Background: miRNAs are now recognized as key regulator elements in gene expression. Although they have been associated with a number of human diseases, their implication in acute and chronic asthma and their association with lung remodelling have never been thoroughly investigated. Methodology/Principal Findings: In order to establish a miRNA expression profile in lung tissue, mice were sensitized and challenged with ovalbumin, mimicking acute, intermediate and chronic human asthma. Levels of lung miRNAs were profiled by microarray, and in silico analyses were performed to identify potential mRNA targets and to point out signalling pathways and biological processes regulated by miRNA-dependent mechanisms. Fifty-eight, 66 and 75 miRNAs were found to be significantly modulated at short-, intermediate- and long-term challenge, respectively. Inverse correlation with the expression of potential mRNA targets identified mmu-miR-146b, -223, -29b, -29c, -483, -574-5p, -672 and -690 as the best candidates for an active implication in asthma pathogenesis. A functional validation assay was performed by cotransfecting in human lung fibroblasts (WI26) synthetic miRNAs and engineered expression constructs containing the coding sequence of luciferase upstream of the 3′UTR of various potential mRNA targets. The bioinformatics analysis identified miRNA-linked regulation of several signalling pathways, such as matrix metalloproteinases, inflammatory response and TGF-β signalling, and biological processes, including apoptosis and inflammation. Conclusions/Significance: This study highlights that specific miRNAs are likely to be involved in asthma disease and could represent a valuable resource both for biological marker identification and for unveiling mechanisms underlying the pathogenesis of asthma.

Molecular BioSystems | 2009

Supervised learning with decision tree-based methods in computational and systems biology

Pierre Geurts; Alexandre Irrthum; Louis Wehenkel

At the intersection between artificial intelligence and statistics, supervised learning allows algorithms to automatically build predictive models from just observations of a system. During the last twenty years, supervised learning has been a tool of choice to analyze the ever larger and more complex data generated in the context of molecular biology, with successful applications in genome annotation, function prediction, or biomarker discovery. Among supervised learning methods, decision tree-based methods stand out as nonparametric methods that have the unique feature of combining interpretability, efficiency, and, when used in ensembles of trees, excellent accuracy. The goal of this paper is to provide an accessible and comprehensive introduction to this class of methods. The first part of the review is devoted to an intuitive but complete description of decision tree-based methods and a discussion of their strengths and limitations with respect to other supervised learning methods. The second part of the review provides a survey of their applications in the context of computational and systems biology.

Clinical Biochemistry | 2008

Proteomics for prediction and characterization of response to infliximab in Crohn's disease: A pilot study

Marie-Alice Meuwis; Marianne Fillet; Laurence Lutteri; Pierre Geurts; Dominique de Seny; Michel Malaise; Jean-Paul Chapelle; Louis Wehenkel; Jacques Belaiche; Marie-Paule Merville; Edouard Louis

OBJECTIVES: Infliximab is the first anti-TNFalpha agent approved by the Food and Drug Administration for use in inflammatory bowel disease treatment. Few clinical, biological and genetic factors tend to predict response in Crohn's disease (CD) patient subcategories, and none broadly predicts response to infliximab. DESIGN AND METHODS: Twenty CD patients showing clinical response or non-response to infliximab were used for serum proteomic profiling on Surface Enhanced Laser Desorption Ionisation-Time of Flight-Mass Spectrometry (SELDI-TOF-MS), each before and after treatment. Univariate and multivariate data analyses were performed for prediction and characterization of response to infliximab. RESULTS: We obtained a classification model predicting response to treatment and selected relevant potential biomarkers, among them platelet aggregation factor 4 (PF4). We quantified PF4, sCD40L and IL-6 by ELISA for correlation studies. CONCLUSIONS: This first proteomic pilot study on response to infliximab in CD suggests an association between platelet metabolism and response to infliximab and requires validation studies on a larger cohort of patients.

Zebrafish | 2013

Automated Processing of Zebrafish Imaging Data: A Survey

Ralf Mikut; Thomas Dickmeis; Wolfgang Driever; Pierre Geurts; Fred A. Hamprecht; Bernhard X. Kausler; Maria J. Ledesma-Carbayo; Karol Mikula; Periklis Pantazis; Olaf Ronneberger; Andrés Santos; Rainer Stotzka; Uwe Strähle; Nadine Peyriéras

Due to the relative transparency of its embryos and larvae, the zebrafish is an ideal model organism for bioimaging approaches in vertebrates. Novel microscope technologies allow the imaging of developmental processes in unprecedented detail, and they enable the use of complex image-based read-outs for high-throughput/high-content screening. Such applications can easily generate terabytes of image data, the handling and analysis of which become a major bottleneck in extracting the targeted information. Here, we describe the current state of the art in computational image analysis in the zebrafish system. We discuss the challenges encountered when handling high-content image data, especially with regard to data quality, annotation, and storage. We survey methods for preprocessing image data for further analysis, and describe selected examples of automated image analysis, including the tracking of cells during embryogenesis, heartbeat detection, identification of dead embryos, recognition of tissues and anatomical landmarks, and quantification of behavioral patterns of adult fish. We review recent examples of applications using such methods, such as the comprehensive analysis of cell lineages during early development, the generation of a three-dimensional brain atlas of zebrafish larvae, and high-throughput drug screens based on movement patterns. Finally, we identify future challenges for the zebrafish image analysis community, notably those concerning the compatibility of algorithms and data formats for the assembly of modular analysis pipelines.

International Conference on Networking | 2010

Network distance prediction based on decentralized matrix factorization

Yongjun Liao; Pierre Geurts; Guy Leduc

Network Coordinate Systems (NCS) are promising techniques to predict unknown network distances from a limited number of measurements. Most NCS algorithms are based on metric space embedding and suffer from the inability to represent distance asymmetries and Triangle Inequality Violations (TIVs). To overcome these drawbacks, we formulate the problem of network distance prediction as guessing the missing elements of a distance matrix and solve it by matrix factorization. A distinct feature of our approach, called Decentralized Matrix Factorization (DMF), is that it is fully decentralized. The factorization of the incomplete distance matrix is collaboratively and iteratively done at all nodes with each node retrieving only a small number of distance measurements. There are no special nodes such as landmarks nor a central node where the distance measurements are collected and stored. We compare DMF with two popular NCS algorithms: Vivaldi and IDES. The former is based on metric space embedding, while the latter is also based on matrix factorization but uses landmarks. Experimental results show that DMF achieves competitive accuracy with the double advantage of having no landmarks and of being able to represent distance asymmetries and TIVs.
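
The decentralized factorization described in this abstract can be illustrated with a small simulation. This is a sketch under assumptions, not the paper's DMF algorithm: each node `i` keeps an outgoing vector `X[i]` and an incoming vector `Y[i]` (hypothetical names), the distance from `i` to `j` is estimated as the dot product of `X[i]` and `Y[j]`, and plain stochastic gradient steps stand in for whatever update rule the authors use. Each step changes only the vectors owned by the node taking the measurement, and because the estimate for `i` to `j` differs from the estimate for `j` to `i`, asymmetric distances can be represented.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def dmf_sketch(dist, neighbors, rank, steps, lr, rng):
    """Simulate nodes that each refine their own factors from a few measurements."""
    n = len(dist)
    X = [[rng.uniform(0.0, 1.0) for _ in range(rank)] for _ in range(n)]
    Y = [[rng.uniform(0.0, 1.0) for _ in range(rank)] for _ in range(n)]
    for _ in range(steps):
        i = rng.randrange(n)             # the node that acts in this step
        j = rng.choice(neighbors[i])     # one of its few measured neighbors
        # Node i fits its outgoing vector to the measured distance i -> j.
        err_out = dot(X[i], Y[j]) - dist[i][j]
        X[i] = [x - lr * err_out * y for x, y in zip(X[i], Y[j])]
        # Node i fits its incoming vector to the measured distance j -> i.
        err_in = dot(X[j], Y[i]) - dist[j][i]
        Y[i] = [y - lr * err_in * x for y, x in zip(Y[i], X[j])]
    return X, Y


def mean_squared_error(dist, neighbors, X, Y):
    """Average squared prediction error over the measured pairs."""
    errors = [(dot(X[i], Y[j]) - dist[i][j]) ** 2
              for i in range(len(dist)) for j in neighbors[i]]
    return sum(errors) / len(errors)
```

No landmark or central node appears in the loop: every update uses only the acting node's own vectors, one neighbor's vectors, and the two measurements between them.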

International Conference on Machine Learning | 2006

Kernelizing the output of tree-based methods

Pierre Geurts; Louis Wehenkel; Florence d'Alché-Buc

We extend tree-based methods to the prediction of structured outputs using a kernelization of the algorithm that allows one to grow trees as soon as a kernel can be defined on the output space. The resulting algorithm, called output kernel trees (OK3), generalizes classification and regression trees as well as tree-based ensemble methods in a principled way. It inherits several features of these methods such as interpretability, robustness to irrelevant variables, and input scalability. When only the Gram matrix over the outputs of the learning sample is given, it learns the output kernel as a function of inputs. We show that the proposed algorithm works well on an image reconstruction task and on a biological network inference problem.
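
One way to see why a Gram matrix over the outputs is enough to grow a tree: the variance of a set of outputs in the kernel's feature space can be written using kernel values only, as the mean of k(y_i, y_i) minus the mean of k(y_i, y_j) over all pairs. The sketch below uses this standard identity to score a candidate split; the function names are hypothetical and the variance-reduction score is an assumption about how such a split would be evaluated, not a transcription of OK3.

```python
def gram_variance(K, idx):
    """Output variance in feature space for samples idx, from Gram matrix K."""
    n = len(idx)
    if n == 0:
        return 0.0
    diag = sum(K[i][i] for i in idx) / n
    cross = sum(K[i][j] for i in idx for j in idx) / (n * n)
    return diag - cross


def split_score(K, left, right):
    """Variance reduction of a split, computed from kernel values only."""
    parent = left + right
    weighted = (len(left) * gram_variance(K, left)
                + len(right) * gram_variance(K, right)) / len(parent)
    return gram_variance(K, parent) - weighted
```

With a linear kernel on scalar outputs, `gram_variance` reduces to the ordinary variance, so the score matches the usual regression-tree criterion in that special case.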
