Gilles Louppe | Researchain

Publication

Featured research published by Gilles Louppe.

Bioinformatics | 2016

Collaborative analysis of multi-gigapixel imaging data using Cytomine

Loïc Rollus; Benjamin Stévens; Renaud Hoyoux; Gilles Louppe; Rémy Vandaele; Jean-Michel Begon; Philipp Kainz; Pierre Geurts; Louis Wehenkel

Motivation: Collaborative analysis of massive imaging datasets is essential to enable scientific discoveries. Results: We developed Cytomine to foster active and distributed collaboration of multidisciplinary teams for large-scale image-based studies. It uses web development methodologies and machine learning in order to readily organize, explore, share and analyze (semantically and quantitatively) multi-gigapixel imaging data over the internet. We illustrate how it has been used in several biomedical applications. Availability and implementation: Cytomine (http://www.cytomine.be/) is freely available under an open-source license from http://github.com/cytomine/. A documentation wiki (http://doc.cytomine.be) and a demo server (http://demo.cytomine.be) are also available. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.

PLOS ONE | 2014

Exploiting SNP Correlations within Random Forest for Genome-Wide Association Studies

Vincent Botta; Gilles Louppe; Pierre Geurts; Louis Wehenkel

The primary goal of genome-wide association studies (GWAS) is to discover variants that could lead, in isolation or in combination, to a particular trait or disease. Standard approaches to GWAS, however, are usually based on univariate hypothesis tests and therefore can account neither for correlations due to linkage disequilibrium nor for combinations of several markers. To discover and leverage such potential multivariate interactions, we propose in this work an extension of the Random Forest algorithm tailored for structured GWAS data. In terms of risk prediction, we show empirically on several GWAS datasets that the proposed T-Trees method significantly outperforms both the original Random Forest algorithm and standard linear models, thereby suggesting the actual existence of multivariate non-linear effects due to the combinations of several SNPs. We also demonstrate that variable importances as derived from our method can help identify relevant loci. Finally, we highlight the strong impact that quality control procedures may have, both in terms of predictive power and loci identification. Variable importance results and T-Trees source code are all available at www.montefiore.ulg.ac.be/~botta/ttrees/ and github.com/0asa/TTree-source respectively.

European Conference on Machine Learning | 2012

Ensembles on random patches

Gilles Louppe; Pierre Geurts

In this paper, we consider supervised learning under the assumption that the available memory is small compared to the dataset size. This general framework is relevant in the context of big data, distributed databases and embedded systems. We investigate a very simple, yet effective, ensemble framework that builds each individual model of the ensemble from a random patch of data obtained by drawing random subsets of both instances and features from the whole dataset. We carry out an extensive and systematic evaluation of this method on 29 datasets, using decision tree-based estimators. With respect to popular ensemble methods, these experiments show that the proposed method provides on par performance in terms of accuracy while simultaneously lowering the memory needs, and attains significantly better performance when memory is severely constrained.
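
The patch-drawing idea described in this abstract can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names (`draw_patch`, `fit_random_patches`, `predict_random_patches`) are hypothetical, patches are drawn without replacement, and a toy nearest-centroid learner stands in for the decision tree-based estimators evaluated in the paper.

```python
import random
from collections import Counter


def draw_patch(n_samples, n_features, sample_frac, feature_frac, rng):
    """Draw one random patch: a subset of rows and a subset of columns."""
    n_rows = max(1, int(sample_frac * n_samples))
    n_cols = max(1, int(feature_frac * n_features))
    rows = sorted(rng.sample(range(n_samples), n_rows))
    cols = sorted(rng.sample(range(n_features), n_cols))
    return rows, cols


def fit_centroids(X, y):
    """Toy base learner (an assumption for brevity): one centroid per class."""
    sums, counts = {}, Counter(y)
    for xi, yi in zip(X, y):
        acc = sums.setdefault(yi, [0.0] * len(xi))
        for j, value in enumerate(xi):
            acc[j] += value
    return {c: [s / counts[c] for s in acc] for c, acc in sums.items()}


def predict_centroids(model, x):
    """Return the class whose centroid is closest to x (squared distance)."""
    return min(model, key=lambda c: sum((a - b) ** 2 for a, b in zip(model[c], x)))


def fit_random_patches(X, y, n_estimators=10, sample_frac=0.5,
                       feature_frac=0.5, seed=0):
    """Fit each ensemble member on its own random patch of the data."""
    rng = random.Random(seed)
    ensemble = []
    for _ in range(n_estimators):
        rows, cols = draw_patch(len(X), len(X[0]), sample_frac, feature_frac, rng)
        # Only sample_frac * feature_frac of the data cells are materialized here.
        X_patch = [[X[i][j] for j in cols] for i in rows]
        y_patch = [y[i] for i in rows]
        ensemble.append((cols, fit_centroids(X_patch, y_patch)))
    return ensemble


def predict_random_patches(ensemble, x):
    """Majority vote; each member only looks at the features of its patch."""
    votes = Counter(predict_centroids(model, [x[j] for j in cols])
                    for cols, model in ensemble)
    return votes.most_common(1)[0][0]
```

With `sample_frac=0.5` and `feature_frac=0.5`, each member is fit on roughly a quarter of the data cells, which is the memory saving the abstract refers to. The two fractions are the knobs that trade memory against accuracy.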

Bulletin of the American Meteorological Society | 2015

Solar Energy Prediction: An International Contest to Initiate Interdisciplinary Research on Compelling Meteorological Problems

Amy McGovern; David John Gagne; Lucas Eustaquio; Gilberto Titericz; Benjamin Lazorthes; Owen Zhang; Gilles Louppe; Peter Prettenhofer; Jeffrey B. Basara; Thomas M. Hamill; David Margolin

As meteorological observing systems and models grow in complexity and number, the size of the data becomes overwhelming for humans to analyze using traditional techniques. Computer scientists, and specifically machine learning and data mining researchers, are developing frameworks for analyzing big data. The AMS Committee on Artificial Intelligence and its Applications to Environmental Science aims to bring AI researchers and environmental scientists together to increase the synergy between the two. The AI committee has sponsored 4 previous contests on a variety of meteorological problems including wind energy, storm classification, winter hydrometeor classification, and air pollution, with the goal of bringing together the two fields of research. Although these were successful, the audience was limited to existing environmental science researchers (usually 10-20 teams of people primarily within the AMS community). For the 2013/14 contest, we expanded to a global audience by focusing on the compelling problem of solar energy prediction and by having the established forum Kaggle host our contest. Using this forum, we had over 160 teams from all around the world participate. Improved solar energy forecasting is a necessary component of making solar energy a viable alternative power source. This paper summarizes our experiences in the 2013/14 contest, discusses the data in detail, and presents the winning prediction methods. The contest data come from the NOAA/ESRL Global Ensemble Forecasting System Reforecast Version 2 and the Oklahoma Mesonet with sponsorship from EarthRisk Technologies. All winning methods utilized gradient boosted regression trees but differed in parameter choices and interpolation methods.

GetMobile: Mobile Computing and Communications | 2015

Scikit-learn: Machine Learning Without Learning the Machinery

Gaël Varoquaux; Lars Buitinck; Gilles Louppe; Olivier Grisel; Fabian Pedregosa; Andreas Mueller

Machine learning is a pervasive development at the intersection of statistics and computer science. While it can benefit many data-related applications, the technical nature of the research literature and the corresponding algorithms slows down its adoption. Scikit-learn is an open-source software project that aims at making machine learning accessible to all, whether it be in academia or in industry. It benefits from the general-purpose Python language, which is both broadly adopted in the scientific world, and supported by a thriving ecosystem of contributors. Here we give a quick introduction to scikit-learn as well as to machine-learning basics.
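
As a rough illustration of the kind of workflow scikit-learn aims to make accessible, the sketch below fits and evaluates a classifier in a few lines. It is not taken from the article: the dataset (`load_iris`), the model (`LogisticRegression`), and the split parameters are assumptions chosen for brevity, and it requires scikit-learn to be installed.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load a small bundled dataset as a feature matrix X and a label vector y.
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the samples to estimate generalization accuracy.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Estimators share the same interface: construct, fit, then predict or score.
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)

predictions = clf.predict(X_test)
accuracy = clf.score(X_test, y_test)  # fraction of correct test predictions
```

Swapping in a different model typically only changes the line that constructs `clf`, since other scikit-learn estimators expose the same `fit`, `predict`, and `score` methods.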

International Symposium on Biomedical Imaging | 2014

A hybrid human-computer approach for large-scale image-based measurements using web services and machine learning

Loïc Rollus; Benjamin Stévens; Gilles Louppe; Olivier Caubo; Natacha Rocks; Sandrine Bekaert; Didier Cataldo; Louis Wehenkel

We present a novel methodology combining Web-based software development practices, machine learning, and spatial databases for computer-aided quantification of regions of interest (ROIs) in large-scale imaging data. We describe our main methodological choices, and then illustrate the benefits of the approach (workload reduction, improved precision, scalability, and traceability) on hundreds of whole-slide images of biological tissue slices in cancer research.

arXiv: Machine Learning | 2014

Simple connectome inference from partial correlation statistics in calcium imaging

Antonio Sutera; Arnaud Joly; Vincent François-Lavet; Zixiao Aaron Qiu; Gilles Louppe; Damien Ernst; Pierre Geurts

This book illustrates the thrust of the scientific community to use machine learning concepts for tackling a complex problem: given time series of neuronal spontaneous activity, which is the underlying connectivity between the neurons in the network? The contributing authors also develop tools for the advancement of neuroscience through machine learning techniques, with a focus on the major open problems in neuroscience. While the techniques have been developed for a specific application, they address the more general problem of network reconstruction from observational time series, a problem of interest in a wide variety of domains, including econometrics, epidemiology, and climatology, to cite only a few. The book is designed for the mathematics, physics and computer science communities that carry out research in neuroscience problems. The content is also suitable for the machine learning community because it exemplifies how to approach the same problem from different perspectives.

European Conference on Machine Learning | 2013

API design for machine learning software: experiences from the scikit-learn project

Lars Buitinck; Gilles Louppe; Mathieu Blondel; Fabian Pedregosa; Andreas Mueller; Olivier Grisel; Vlad Niculae; Peter Prettenhofer; Alexandre Gramfort; Jaques Grobler; Robert Layton; Jake Vanderplas; Arnaud Joly; Brian Holt; Gaël Varoquaux

Neural Information Processing Systems | 2013