Nicolas Lachiche | Researchain

Publication

Featured research published by Nicolas Lachiche.

Machine Learning | 2001

Confirmation-guided discovery of first-order rules with Tertius

Peter A. Flach; Nicolas Lachiche

This paper deals with learning first-order logic rules from data lacking an explicit classification predicate. Consequently, the learned rules are not restricted to predicate definitions as in supervised inductive logic programming. First-order logic offers the ability to deal with structured, multi-relational knowledge. Possible applications include first-order knowledge discovery, induction of integrity constraints in databases, multiple predicate learning, and learning mixed theories of predicate definitions and integrity constraints. One of the contributions of our work is a heuristic measure of confirmation, trading off novelty and satisfaction of the rule. The approach has been implemented in the Tertius system. The system performs an optimal best-first search, finding the k most confirmed hypotheses, and includes a non-redundant refinement operator to avoid duplicates in the search. Tertius can be adapted to many different domains by tuning its parameters, and it can deal either with individual-based representations by upgrading propositional representations to first-order, or with general logical rules. We describe a number of experiments demonstrating the feasibility and flexibility of our approach.

Machine Learning | 2004

Naive Bayesian Classification of Structured Data

Peter A. Flach; Nicolas Lachiche

In this paper we present 1BC and 1BC2, two systems that perform naive Bayesian classification of structured individuals. The approach of 1BC is to project the individuals along first-order features. These features are built from the individual using structural predicates referring to related objects (e.g., atoms within molecules), and properties applying to the individual or one or several of its related objects (e.g., a bond between two atoms). We describe an individual in terms of elementary features consisting of zero or more structural predicates and one property; these features are treated as conditionally independent in the spirit of the naive Bayes assumption. 1BC2 represents an alternative first-order upgrade to the naive Bayesian classifier by considering probability distributions over structured objects (e.g., a molecule as a set of atoms), and estimating those distributions from the probabilities of its elements (which are assumed to be independent). We present a unifying view on both systems in which 1BC works in language space, and 1BC2 works in individual space. We also present a new, efficient recursive algorithm improving upon the original propositionalisation approach of 1BC. Both systems have been implemented in the context of the first-order descriptive learner Tertius, and we investigate the differences between the two systems both in computational terms and on artificially generated data. Finally, we describe a range of experiments on ILP benchmark data sets demonstrating the viability of our approach.
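The individual-space view described above, where a structured object is scored from the probabilities of its independently treated elements, can be illustrated with a small Python sketch. This is not the authors' implementation of 1BC2. The function names, the Laplace smoothing constant `alpha`, and the toy data are assumptions made here for illustration, and details such as how set sizes are modelled are left out.

```python
import math
from collections import Counter, defaultdict


def train(examples, alpha=1.0):
    """Estimate class counts and per-class element counts.

    `examples` is a list of (elements, label) pairs, where `elements`
    is a list of hashable element descriptions (e.g. atom types).
    `alpha` is a Laplace smoothing constant (an assumption of this sketch).
    """
    class_counts = Counter()
    element_counts = defaultdict(Counter)
    vocabulary = set()
    for elements, label in examples:
        class_counts[label] += 1
        for element in elements:
            element_counts[label][element] += 1
            vocabulary.add(element)
    return class_counts, element_counts, vocabulary, alpha


def log_score(model, elements, label):
    """Log of the class prior times the product of smoothed element probabilities."""
    class_counts, element_counts, vocabulary, alpha = model
    total_examples = sum(class_counts.values())
    score = math.log(class_counts[label] / total_examples)
    total_elements = sum(element_counts[label].values())
    # One extra slot in the denominator reserves mass for unseen elements.
    denominator = total_elements + alpha * (len(vocabulary) + 1)
    for element in elements:
        count = element_counts[label][element]
        score += math.log((count + alpha) / denominator)
    return score


def classify(model, elements):
    """Return the class with the highest log score."""
    class_counts = model[0]
    return max(class_counts, key=lambda label: log_score(model, elements, label))


# Hypothetical toy data: each individual is a bag of atom types.
toy_data = [
    (["c", "c", "o"], "active"),
    (["c", "o", "o"], "active"),
    (["c", "n", "n"], "inactive"),
    (["n", "n", "c"], "inactive"),
]
model = train(toy_data)
prediction = classify(model, ["o", "c"])
```

Working in log space avoids numerical underflow when an individual contains many elements.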

Inductive Logic Programming | 1999

1BC: A First-Order Bayesian Classifier

Peter A. Flach; Nicolas Lachiche

In this paper we present 1BC, a first-order Bayesian Classifier. Our approach is to view individuals as structured terms, and to distinguish between structural predicates referring to subterms (e.g. atoms from molecules), and properties applying to one or several of these subterms (e.g. a bond between two atoms). We describe an individual in terms of elementary features consisting of zero or more structural predicates and one property; these features are considered conditionally independent following the usual naive Bayes assumption. 1BC has been implemented in the context of the first-order descriptive learner Tertius, and we describe several experiments demonstrating the viability of our approach.

Genetic and Evolutionary Computation Conference | 2009

Coarse grain parallelization of evolutionary algorithms on GPGPU cards with EASEA

Ogier Maitre; Laurent A. Baumes; Nicolas Lachiche; Avelino Corma; Pierre Collet

This paper presents a straightforward implementation of a standard evolutionary algorithm that evaluates its population in parallel on a GPGPU card. Tests on a benchmark and a real-world problem, using an old NVidia 8800GTX card and a newer but not top-of-the-range GTX260 card, show a roughly 30x (resp. 100x) speedup for the whole algorithm compared to the same algorithm running on a standard 3.6GHz PC. Given that much faster hardware is already available, this opens new horizons for evolutionary computation, as search spaces can now be explored 2 or 3 orders of magnitude faster, depending on the number of GPGPU cards used. Since these cards remain very difficult to program, the know-how has been integrated into the old EASEA language, which can now output code for GPGPU (-cuda option).

Journal of Chemical Information and Modeling | 2006

Benchmarking of linear and nonlinear approaches for quantitative structure-property relationship studies of metal complexation with ionophores.

Igor V. Tetko; Vitaly P. Solov'ev; Alexey V. Antonov; Xiaojun Yao; Jean Pierre Doucet; Botao Fan; Frank Hoonakker; Denis Fourches; Piere Jost; Nicolas Lachiche; Alexandre Varnek

A benchmark of several popular methods, Associative Neural Networks (ANN), Support Vector Machines (SVM), k Nearest Neighbors (kNN), Maximal Margin Linear Programming (MMLP), Radial Basis Function Neural Network (RBFNN), and Multiple Linear Regression (MLR), is reported for quantitative structure-property relationships (QSPR) of stability constants logK1 for the 1:1 (M:L) and logbeta2 for 1:2 complexes of metal cations Ag+ and Eu3+ with diverse sets of organic molecules in water at 298 K and ionic strength 0.1 M. The methods were tested on three types of descriptors: molecular descriptors including E-state values, counts of atoms determined for E-state atom types, and substructural molecular fragments (SMF). Comparison of the models was performed using a 5-fold external cross-validation procedure. Robust statistical tests (bootstrap and Kolmogorov-Smirnov statistics) were employed to evaluate the significance of calculated models. The Wilcoxon signed-rank test was used to compare the performance of methods. Individual structure-complexation property models obtained with nonlinear methods demonstrated significantly better performance than the models built using multilinear regression analysis (MLRA). However, averaging several MLRA models based on SMF descriptors provided predictions as good as those of the most efficient nonlinear techniques. Support Vector Machines and Associative Neural Networks contributed to the largest number of significant models. Models based on fragments (SMF descriptors and E-state counts) had higher prediction ability than those based on E-state indices. The use of SMF descriptors and E-state counts provided similar results, whereas E-state indices led to less significant models. The current study illustrates the difficulties of quantitative comparison of different methods: conclusions based only on one data set without appropriate statistical tests could be wrong.
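The 5-fold cross-validation mentioned above rests on partitioning the data so that every sample is held out exactly once. The Python sketch below shows one simple way to build such a partition. The function name `k_fold_indices` is hypothetical, the folds are contiguous and unshuffled, and nothing here reproduces the study's actual protocol, in which descriptor selection and model fitting would use only the training part of each split.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, test_indices) for k contiguous folds.

    The first `n_samples % k` folds receive one extra sample so that
    every index appears in exactly one test fold.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Example with 12 samples split into 5 folds.
folds = list(k_fold_indices(12, k=5))
```

In practice the indices would usually be shuffled first so that fold membership does not depend on the order of the data set.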

IEEE Transactions on Geoscience and Remote Sensing | 2014

Active Learning in the Spatial Domain for Remote Sensing Image Classification

André Stumpf; Nicolas Lachiche; Jean-Philippe Malet; N. Kerle; Anne Puissant

Active learning (AL) algorithms have been proven useful in reducing the number of required training samples for remote sensing applications; however, most methods query samples pointwise without considering spatial constraints on their distribution. This may often lead to a spatially dispersed distribution of training points unfavorable for visual image interpretation or field surveys. The aim of this study is to develop region-based AL heuristics to guide user attention toward a limited number of compact spatial batches rather than distributed points. The proposed query functions are based on a tree ensemble classifier and combine criteria of sample uncertainty and diversity to select regions of interest. Class imbalance, which is inherent to many remote sensing applications, is addressed through stratified bootstrap sampling. Empirical tests of the proposed methods are performed with multitemporal and multisensor satellite images capturing, in particular, sites recently affected by large-scale landslide events. The assessment includes an experimental evaluation of the labeling time required by the user and the computational runtime, and a sensitivity analysis of the main algorithm parameters. Region-based heuristics that consider sample uncertainty and diversity are found to outperform pointwise sampling and region-based methods that consider only uncertainty. Reference landslide inventories from five different experts enable a detailed assessment of the spatial distribution of remaining errors and the uncertainty of the reference data.
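The region-based querying idea described above can be sketched in Python by aggregating per-sample uncertainty over spatial regions and ranking the regions. This is a minimal sketch under stated assumptions, not the paper's query functions: the vote-entropy measure, the mean aggregation, the function names, and the toy votes are all chosen here for illustration, and the diversity criterion that the abstract reports as important is not modelled.

```python
import math
from collections import defaultdict


def vote_entropy(votes):
    """Shannon entropy (in nats) of the class votes cast by an ensemble."""
    total = sum(votes.values())
    entropy = 0.0
    for count in votes.values():
        if count > 0:
            p = count / total
            entropy -= p * math.log(p)
    return entropy


def rank_regions(samples):
    """Rank regions by the mean vote entropy of their samples, highest first.

    `samples` is a list of (region_id, votes) pairs; `votes` maps a class
    label to the number of ensemble members predicting it.
    """
    per_region = defaultdict(list)
    for region_id, votes in samples:
        per_region[region_id].append(vote_entropy(votes))
    mean_uncertainty = {
        region_id: sum(values) / len(values)
        for region_id, values in per_region.items()
    }
    return sorted(mean_uncertainty, key=mean_uncertainty.get, reverse=True)


# Hypothetical votes from a 100-tree ensemble for samples in two regions.
samples = [
    ("A", {"landslide": 50, "other": 50}),
    ("A", {"landslide": 40, "other": 60}),
    ("B", {"landslide": 5, "other": 95}),
    ("B", {"landslide": 0, "other": 100}),
]
ranking = rank_regions(samples)
```

The region at the head of `ranking` would be the one presented to the user for labeling in this simplified, uncertainty-only setting.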

Soft Computing | 2012

EASEA: specification and execution of evolutionary algorithms on GPGPU

Ogier Maitre; Frédéric Krüger; Stephane Querry; Nicolas Lachiche; Pierre Collet

EASEA is a framework designed to help non-expert programmers optimize their problems by evolutionary computation. It can generate code targeted at standard CPU architectures, GPGPU-equipped machines, and distributed-memory clusters. In this paper, EASEA is presented through its underlying algorithms and some example problems. Achievable speedups are also shown on different NVIDIA GPGPU cards for different optimization algorithm families.

European Conference on Genetic Programming | 2010

Fast evaluation of GP trees on GPGPU by optimizing hardware scheduling

Ogier Maitre; Nicolas Lachiche; Pierre Collet

This paper shows that General-Purpose Graphics Processing Unit cards can be used for fast evaluation of different Genetic Programming trees on as few as 32 fitness cases by exploiting the hardware scheduling of NVIDIA cards. Depending on the function set, the observed speedup ranges between ×50 and ×250 on one half of an NVidia GTX295 GPGPU card, versus a single core of an Intel Quad core Q8200.

International Opensource Geospatial Research Symposium (OGRS 2009) | 2012

A Platform for Spatial Data Labeling in an Urban Context

Julien Lesbegueries; Nicolas Lachiche; Agnès Braud; Grzegorz Skupinski; Anne Puissant; Julien Perret

This chapter presents a platform for classifying urban areas, supported by a machine learning framework that eases this classification. Using this platform, we propose an iterative procedure in which geographic experts define classes or “labels” and then classify in a semi-automated way. This work is part of the GeOpenSim project and has been developed within the Geoxygene framework.

International Journal on Artificial Intelligence Tools | 2011

A Representation to Apply Usual Data Mining Techniques to Chemical Reactions: Illustration on the Rate Constant of SN2 Reactions in Water

Frank Hoonakker; Nicolas Lachiche; Alexandre Varnek; Alain Wagner

Chemical reactions always involve several molecules of two types, reactants and products. Existing data mining techniques, e.g., Quantitative Structure Activity Relationship (QSAR) methods, deal with individual molecules only. In this article, we propose to use a Condensed Graph of Reaction (CGR) to merge all molecules involved in a reaction into one molecular graph. This allows one to consider reactions as pseudo-molecules and to develop QSAR models based on fragment descriptors. Then ISIDA (In SIlico Design and Analysis) fragment descriptors built from CGRs are used to generate models for the rate constant of SN2 reactions in water, using three usual attribute-value regression algorithms (linear regression, support vector machine, and regression trees). This approach compares favorably with two state-of-the-art relational data mining techniques.
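The idea of merging reactants and products into one graph can be illustrated with a short Python sketch. It assumes that atoms are already mapped between the two sides of the reaction (each atom keeps the same identifier) and represents bonds as plain dictionaries. The function names, the labels, and the toy SN2 example are assumptions made here, and the sketch does not reproduce the ISIDA software or its fragment descriptors.

```python
def condensed_graph(reactant_bonds, product_bonds):
    """Merge reactant and product bonds into one labelled edge map.

    Each argument maps a frozenset of two atom identifiers to a bond order.
    The result maps each atom pair to (order_before, order_after), with 0
    meaning "no bond on that side of the reaction".
    """
    merged = {}
    for pair in set(reactant_bonds) | set(product_bonds):
        merged[pair] = (reactant_bonds.get(pair, 0), product_bonds.get(pair, 0))
    return merged


def describe(change):
    """Turn an (order_before, order_after) pair into a readable label."""
    before, after = change
    if before == 0:
        return "formed"
    if after == 0:
        return "broken"
    return "unchanged" if before == after else "changed"


# Toy SN2 example: CH3Cl + OH- -> CH3OH + Cl- (hydrogens on carbon omitted).
reactants = {frozenset({"C1", "Cl2"}): 1, frozenset({"O3", "H4"}): 1}
products = {frozenset({"C1", "O3"}): 1, frozenset({"O3", "H4"}): 1}

cgr = condensed_graph(reactants, products)
labels = {tuple(sorted(pair)): describe(change) for pair, change in cgr.items()}
```

Because every edge carries both its state before and after the reaction, the merged graph can be treated as a single pseudo-molecule from which fragments are then enumerated.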
