Fabien Torre | Researchain

Publication

Featured research published by Fabien Torre.

international conference on machine learning and applications | 2010

Boosting Multi-Task Weak Learners with Applications to Textual and Social Data

Jean Baptiste Faddoul; Boris Chidlovskii; Fabien Torre; Rémi Gilleron

Learning multiple related tasks from data simultaneously can improve predictive performance relative to learning these tasks independently. In this paper we propose a novel multi-task learning algorithm called MT-Adaboost: it extends the Adaboost algorithm [Freund1999Short] to the multi-task setting and uses a multi-task decision stump as its multi-task weak classifier. This allows it to learn different dependencies between tasks for different regions of the learning space. Thus, we relax the conventional hypothesis that tasks behave similarly in the whole learning space. Moreover, MT-Adaboost can learn multiple tasks without imposing the constraint of sharing the same label set and/or examples between tasks. A theoretical analysis is derived from the analysis of the original Adaboost. Experiments for multiple tasks over large-scale textual data sets with social context (Enron and Tobacco) give very promising results.
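For orientation, the following is a minimal sketch of the classical single-task Adaboost with one-dimensional decision stumps, the baseline that the abstract says MT-Adaboost extends. It is not the paper's multi-task algorithm; the function names (`stump_predict`, `train_stump`, `adaboost`, `predict`) and the toy setting with labels in {-1, +1} are assumptions made for illustration.

```python
import math

def stump_predict(x, thr, polarity):
    """A decision stump: predict `polarity` when x >= thr, else the opposite."""
    return polarity if x >= thr else -polarity

def train_stump(xs, ys, weights):
    """Return (threshold, polarity, weighted_error) of the best stump.

    Hypothetical helper: exhaustive search over thresholds taken from the data.
    """
    best = None
    for thr in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict(x, thr, polarity) != y)
            if best is None or err < best[2]:
                best = (thr, polarity, err)
    return best

def adaboost(xs, ys, rounds=10):
    """Classical (single-task) Adaboost; labels must be -1 or +1."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        thr, pol, err = train_stump(xs, ys, weights)
        # Clamp the error to keep the logarithm finite.
        err = min(max(err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, thr, pol))
        # Increase the weight of misclassified examples, then renormalise.
        weights = [w * math.exp(-alpha * y * stump_predict(x, thr, pol))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * stump_predict(x, t, p) for a, t, p in ensemble)
    return 1 if score >= 0 else -1
```

On a toy set such as `xs = [1, 2, 3, 4, 5, 6]` with `ys = [1, 1, -1, -1, 1, 1]`, no single stump separates the classes, but three boosting rounds are enough to fit the training points.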

european conference on machine learning | 2012

Learning multiple tasks with boosted decision trees

Jean Baptiste Faddoul; Boris Chidlovskii; Rémi Gilleron; Fabien Torre

We address the problem of multi-task learning with no label correspondence among tasks. Learning multiple related tasks simultaneously, by exploiting their shared knowledge, can improve the predictive performance on every task. We develop the multi-task Adaboost environment with Multi-Task Decision Trees as weak classifiers. We first adapt the well-known decision tree learning to the multi-task setting. We revise the information gain rule for learning decision trees in the multi-task setting. We use this feature to develop a novel criterion for learning Multi-Task Decision Trees. The criterion guides the tree construction by learning the decision rules from data of different tasks, and representing different degrees of task relatedness. We then modify MT-Adaboost to combine Multi-Task Decision Trees as weak learners. We experimentally validate the advantage of the new technique; we report results of experiments conducted on several multi-task datasets, including the Enron email set and Spam Filtering collection.
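As background for the revised rule mentioned above, here is a small sketch of the classical single-task information gain of a binary split. The multi-task criterion of the paper is not reproduced here; the helper names `entropy` and `information_gain` are illustrative assumptions, and both halves of the split are assumed to come from a non-empty label list.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(labels, left, right):
    """Classical information gain of splitting `labels` into `left` and `right`.

    Assumes `labels` is non-empty and that `left` and `right` partition it.
    """
    n = len(labels)
    return (entropy(labels)
            - (len(left) / n) * entropy(left)
            - (len(right) / n) * entropy(right))
```

A split that separates the classes perfectly recovers the full entropy of the parent node, while a split that leaves the class proportions unchanged yields a gain of zero.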

machine learning and data mining in pattern recognition | 2005

SSC: statistical subspace clustering

Laurent Candillier; Isabelle Tellier; Fabien Torre; Olivier Bousquet

Subspace clustering is an extension of traditional clustering that seeks to find clusters in different subspaces within a dataset. This is a particularly important challenge with high-dimensional data, where the curse of dimensionality occurs. It also has the benefit of providing smaller descriptions of the clusters found. Existing methods only consider numerical databases and do not propose any method for cluster visualization. Besides, they require some input parameters that are difficult for the user to set. The aim of this paper is to propose a new subspace clustering algorithm, able to tackle databases that may contain continuous as well as discrete attributes, requiring as few user parameters as possible, and producing an interpretable output. We present a method based on the use of the well-known EM algorithm on a probabilistic model designed under some specific hypotheses, allowing us to present the result as a set of rules, each one defined with as few relevant dimensions as possible. Experiments, conducted on artificial as well as real databases, show that our algorithm gives robust results, in terms of classification and interpretability of the output.

european conference on machine learning | 2006

Cascade evaluation of clustering algorithms

Laurent Candillier; Isabelle Tellier; Fabien Torre; Olivier Bousquet

This paper is about the evaluation of the results of clustering algorithms, and the comparison of such algorithms. We propose a new method based on the enrichment of a set of independent labeled datasets by the results of clustering, and the use of a supervised method to evaluate the interest of adding such new information to the datasets. We thus adapt the cascade generalization [1] paradigm to the case where we combine an unsupervised and a supervised learner. We also consider the case where independent supervised learning is performed on the different groups of data objects created by the clustering [2]. We then conduct experiments using different supervised algorithms to compare various clustering algorithms. We thus show that our proposed method exhibits a coherent behavior, pointing out, for example, that the algorithms based on the use of complex probabilistic models outperform algorithms based on the use of simpler models.
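The enrichment idea can be sketched as follows, under strong simplifying assumptions: the cluster identifier is appended as one extra numeric attribute, and the supervised learner is a leave-one-out 1-nearest-neighbour classifier with squared Euclidean distance. The function names and the choice of learner are hypothetical and are not taken from the paper.

```python
def enrich(rows, cluster_ids):
    """Append each row's cluster id as an extra attribute (rows are tuples)."""
    return [row + (cid,) for row, cid in zip(rows, cluster_ids)]

def one_nn_loo_accuracy(rows, labels):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier."""
    correct = 0
    for i, row in enumerate(rows):
        # Nearest other row under squared Euclidean distance.
        best_j = min(
            (j for j in range(len(rows)) if j != i),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(row, rows[j])),
        )
        correct += labels[best_j] == labels[i]
    return correct / len(rows)

def cascade_score(rows, labels, cluster_ids):
    """Accuracy gained by the supervised learner once cluster ids are added."""
    base = one_nn_loo_accuracy(rows, labels)
    enriched = one_nn_loo_accuracy(enrich(rows, cluster_ids), labels)
    return enriched - base
```

Under this sketch, a clustering whose groups align with the class labels raises the score, whereas a clustering that puts every object in the same group leaves it at zero, which gives a simple way to rank clusterings against each other.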

web intelligence | 2006

Interactive Tuples Extraction from Semi-Structured Data

Rémi Gilleron; Patrick Marty; Marc Tommasi; Fabien Torre

This paper studies, from a machine learning viewpoint, the problem of extracting tuples of a target n-ary relation from tree-structured data such as XML or XHTML documents. Our system can extract, without any post-processing, tuples for all data structures including nested, rotated and cross tables. The wrapper induction algorithm we propose is based on two main ideas. It is incremental: partial tuples are extracted by increasing length. It is based on a representation-enrichment procedure: partial tuples of length i are encoded with the knowledge of extracted tuples of length i-1. The algorithm is then set in a friendly interactive wrapper induction system for Web documents. We evaluate our system on several information extraction tasks over corporate Web sites. It achieves state-of-the-art results on simple data structures and succeeds on complex data structures where previous approaches fail. Experiments also show that our interactive framework significantly reduces the number of user interactions needed to build a wrapper.

european conference on machine learning | 1997

Natural Ideal Operators in Inductive Logic Programming

Fabien Torre; Céline Rouveirol

We present in this paper the original notion of natural relation, a quasi-order that extends the idea of generality order: it allows the sound and dynamic pruning of hypotheses that do not satisfy a property, be it completeness or correctness with respect to the training examples, or hypothesis language restriction. Natural relations for conjunctions of such properties are characterized. Learning operators that satisfy these complex natural relations allow pruning with respect to this set of properties to take place before inappropriate hypotheses are generated. Once the natural relation that optimally prunes the search space with respect to a set of properties is defined, we discuss the existence of ideal operators for the search space ordered by this natural relation. We have adapted the results from [vdLNC94a] on the non-existence of ideal operators to those complex natural relations. We prove that those non-existence conditions do not apply to some of those natural relations, thus overcoming the previous negative results about ideal operators for spaces ordered by θ-subsumption only.

Archive | 2004