Fabien Torre
University of Lille
Publications
Featured research published by Fabien Torre.
International Conference on Machine Learning and Applications | 2010
Jean Baptiste Faddoul; Boris Chidlovskii; Fabien Torre; Rémi Gilleron
Learning multiple related tasks from data simultaneously can improve predictive performance relative to learning these tasks independently. In this paper we propose a novel multi-task learning algorithm called MT-Adaboost: it extends the Adaboost algorithm [Freund1999Short] to the multi-task setting and uses a multi-task decision stump as its multi-task weak classifier. This makes it possible to learn different dependencies between tasks for different regions of the learning space. Thus, we relax the conventional hypothesis that tasks behave similarly in the whole learning space. Moreover, MT-Adaboost can learn multiple tasks without imposing the constraint of sharing the same label set and/or examples between tasks. A theoretical analysis is derived from the analysis of the original Adaboost. Experiments for multiple tasks over large-scale textual data sets with social context (Enron and Tobacco) give very promising results.
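For reference, the boosting scheme that the abstract builds on can be sketched as follows. This is a minimal sketch of standard single-task Adaboost with one-feature threshold stumps, not the MT-Adaboost algorithm itself: the multi-task stump is not specified in the abstract and is not reproduced here. The function names and the toy data are illustrative assumptions.

```python
import math

def stump_predict(x, feature, threshold, polarity):
    """Predict `polarity` if x[feature] <= threshold, else the opposite label."""
    return polarity if x[feature] <= threshold else -polarity

def train_stump(X, y, w):
    """Exhaustively pick the stump with the lowest weighted training error."""
    best = None
    for feature in range(len(X[0])):
        for threshold in sorted({x[feature] for x in X}):
            for polarity in (1, -1):
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if stump_predict(xi, feature, threshold, polarity) != yi)
                if best is None or err < best[0]:
                    best = (err, feature, threshold, polarity)
    return best

def adaboost(X, y, rounds=3):
    """Standard Adaboost for labels in {-1, +1}; returns weighted stumps."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, feature, threshold, polarity = train_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, feature, threshold, polarity))
        # Increase the weight of misclassified examples, decrease the others.
        w = [wi * math.exp(-alpha * yi * stump_predict(xi, feature, threshold, polarity))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * stump_predict(x, f, t, p) for a, f, t, p in ensemble)
    return 1 if score >= 0 else -1

# Toy data (assumed): no single threshold separates the +1 / -1 / +1 pattern.
X = [[1], [2], [3], [4], [5], [6]]
y = [1, 1, -1, -1, 1, 1]
ensemble = adaboost(X, y, rounds=3)
```

On this toy set a single stump necessarily misclassifies some points, while the weighted vote of three stumps fits the training labels.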
European Conference on Machine Learning | 2012
Jean Baptiste Faddoul; Boris Chidlovskii; Rémi Gilleron; Fabien Torre
We address the problem of multi-task learning with no label correspondence among tasks. Learning multiple related tasks simultaneously, by exploiting their shared knowledge, can improve the predictive performance on every task. We develop the multi-task Adaboost environment with Multi-Task Decision Trees as weak classifiers. We first adapt the well-known decision tree learning algorithm to the multi-task setting. We revise the information gain rule for learning decision trees in the multi-task setting. We use this feature to develop a novel criterion for learning Multi-Task Decision Trees. The criterion guides the tree construction by learning the decision rules from data of different tasks and by representing different degrees of task relatedness. We then modify MT-Adaboost to combine Multi-Task Decision Trees as weak learners. We experimentally validate the advantage of the new technique; we report results of experiments conducted on several multi-task datasets, including the Enron email set and Spam Filtering collection.
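The information gain rule that the abstract revises can be illustrated in its standard single-task form. The sketch below computes the classical entropy-based gain of a binary split; the revised multi-task criterion is not given in the abstract and is not reproduced. The function names and the example labels are assumptions.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(labels, goes_left):
    """Entropy reduction obtained by splitting `labels` with a boolean mask."""
    left = [l for l, m in zip(labels, goes_left) if m]
    right = [l for l, m in zip(labels, goes_left) if not m]
    if not left or not right:
        return 0.0  # a split that sends everything to one side gains nothing
    n = len(labels)
    return (entropy(labels)
            - (len(left) / n) * entropy(left)
            - (len(right) / n) * entropy(right))

# Assumed toy labels: a perfect split and an uninformative split.
labels = ["spam", "spam", "ham", "ham"]
perfect = information_gain(labels, [True, True, False, False])
useless = information_gain(labels, [True, False, True, False])
```

A tree learner would evaluate this gain for each candidate test and keep the test with the highest value.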
Machine Learning and Data Mining in Pattern Recognition | 2005
Laurent Candillier; Isabelle Tellier; Fabien Torre; Olivier Bousquet
Subspace clustering is an extension of traditional clustering that seeks to find clusters in different subspaces within a dataset. This is a particularly important challenge with high-dimensional data, where the curse of dimensionality occurs. It also has the benefit of providing smaller descriptions of the clusters found. Existing methods only consider numerical databases and do not propose any method for cluster visualization. Besides, they require input parameters that are difficult for the user to set. The aim of this paper is to propose a new subspace clustering algorithm that can handle databases containing continuous as well as discrete attributes, requires as few user parameters as possible, and produces an interpretable output. We present a method based on the use of the well-known EM algorithm on a probabilistic model designed under some specific hypotheses, allowing us to present the result as a set of rules, each one defined with as few relevant dimensions as possible. Experiments, conducted on artificial as well as real databases, show that our algorithm gives robust results, in terms of classification and interpretability of the output.
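The EM algorithm mentioned in the abstract alternates between computing soft cluster memberships and re-estimating model parameters. As a generic illustration only, the sketch below runs EM on a two-component one-dimensional Gaussian mixture; it is not the paper's probabilistic model, which handles mixed attributes and subspaces. The function name, the initialisation, and the toy data are assumptions.

```python
import math

def em_gmm_1d(data, iters=50, var_floor=1e-6):
    """EM for a two-component 1-D Gaussian mixture; returns (means, variances, weights)."""
    n = len(data)
    mean_all = sum(data) / n
    var_all = sum((x - mean_all) ** 2 for x in data) / n
    mu = [min(data), max(data)]             # assumed initialisation: the two extremes
    var = [max(var_all, var_floor)] * 2
    pi = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            p = [pi[k] * math.exp(-(x - mu[k]) ** 2 / (2 * var[k]))
                 / math.sqrt(2 * math.pi * var[k]) for k in range(2)]
            total = sum(p)
            resp.append([pk / total for pk in p] if total > 0 else [0.5, 0.5])
        # M-step: re-estimate means, variances, and mixing weights.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var[k] = max(sum(r[k] * (x - mu[k]) ** 2
                             for r, x in zip(resp, data)) / nk, var_floor)
            pi[k] = nk / n
    return mu, var, pi

# Assumed toy data: two well-separated groups around 1.0 and 5.0.
data = [0.9, 1.0, 1.1, 4.9, 5.0, 5.1]
means, variances, weights = em_gmm_1d(data)
```

The variance floor keeps a component from collapsing onto a single point, which would otherwise cause a division by zero.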
European Conference on Machine Learning | 2006
Laurent Candillier; Isabelle Tellier; Fabien Torre; Olivier Bousquet
This paper is about the evaluation of the results of clustering algorithms, and the comparison of such algorithms. We propose a new method based on the enrichment of a set of independent labeled datasets by the results of clustering, and the use of a supervised method to evaluate the interest of adding such new information to the datasets. We thus adapt the cascade generalization [1] paradigm to the case where we combine an unsupervised and a supervised learner. We also consider the case where independent supervised learners are trained on the different groups of data objects created by the clustering [2]. We then conduct experiments using different supervised algorithms to compare various clustering algorithms. We thus show that our proposed method exhibits a coherent behavior, pointing out, for example, that the algorithms based on the use of complex probabilistic models outperform algorithms based on the use of simpler models.
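The enrichment idea described above can be sketched as follows: cluster assignments are appended to the features of a labeled dataset, and a supervised learner is scored on the enriched data. This is a minimal sketch under assumptions, not the paper's protocol: the learner (leave-one-out 1-nearest-neighbour), the one-hot encoding, the function names, and the toy data are all chosen here for illustration, and the clusterings are given as precomputed assignments.

```python
def loo_1nn_accuracy(X, y):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier."""
    correct = 0
    for i, xi in enumerate(X):
        best_j, best_d = None, float("inf")
        for j, xj in enumerate(X):
            if j == i:
                continue
            d = sum((a - b) ** 2 for a, b in zip(xi, xj))  # squared Euclidean distance
            if d < best_d:
                best_j, best_d = j, d
        correct += y[best_j] == y[i]
    return correct / len(X)

def enrich(X, cluster_ids):
    """Append a one-hot encoding of each object's cluster id to its features."""
    ids = sorted(set(cluster_ids))
    return [list(x) + [1.0 if c == k else 0.0 for k in ids]
            for x, c in zip(X, cluster_ids)]

# Assumed toy data: the single original feature does not separate the classes.
X = [[0.1], [0.5], [0.9], [0.2], [0.6], [1.0]]
y = [0, 0, 0, 1, 1, 1]
clustering_a = [0, 0, 0, 1, 1, 1]  # hypothetical output of one clustering algorithm
clustering_b = [0, 1, 0, 1, 0, 1]  # hypothetical output of another

score_a = loo_1nn_accuracy(enrich(X, clustering_a), y)
score_b = loo_1nn_accuracy(enrich(X, clustering_b), y)
```

Under this scheme the clustering whose enrichment yields the higher supervised score is ranked as the better one.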
Web Intelligence | 2006
Rémi Gilleron; Patrick Marty; Marc Tommasi; Fabien Torre
This paper studies, from a machine learning viewpoint, the problem of extracting tuples of a target n-ary relation from tree-structured data such as XML or XHTML documents. Our system can extract, without any post-processing, tuples for all data structures, including nested, rotated and cross tables. The wrapper induction algorithm we propose is based on two main ideas. It is incremental: partial tuples are extracted by increasing length. It is based on a representation-enrichment procedure: partial tuples of length i are encoded with the knowledge of extracted tuples of length i-1. The algorithm is then embedded in a user-friendly interactive wrapper induction system for Web documents. We evaluate our system on several information extraction tasks over corporate Web sites. It achieves state-of-the-art results on simple data structures and succeeds on complex data structures where previous approaches fail. Experiments also show that our interactive framework significantly reduces the number of user interactions needed to build a wrapper.
European Conference on Machine Learning | 1997
Fabien Torre; Céline Rouveirol
We present in this paper the original notion of natural relation, a quasi-order that extends the idea of generality order: it allows the sound and dynamic pruning of hypotheses that do not satisfy a property, be it completeness or correctness with respect to the training examples, or hypothesis language restriction. Natural relations for conjunctions of such properties are characterized. Learning operators that satisfy these complex natural relations allow pruning with respect to this set of properties to take place before inappropriate hypotheses are generated. Once the natural relation that optimally prunes the search space with respect to a set of properties is defined, we discuss the existence of ideal operators for the search space ordered by this natural relation. We have adapted the results from [vdLNC94a] on the non-existence of ideal operators to those complex natural relations. We prove that those non-existence conditions do not apply to some of those natural relations, thus overcoming the previous negative results about ideal operators for spaces ordered by θ-subsumption only.
Archive | 2004
Fabien Torre
Revue d'Intelligence Artificielle | 2005
Fabien Torre
Conférence d'Apprentissage | 2006
Laurent Candillier; Isabelle Tellier; Fabien Torre; Olivier Bousquet
11e Conférence francophone sur l'Apprentissage automatique (CAp'2009) | 2009
Fabien Torre; Alain Terlutte