Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Guénaël Cabanes is active.

Publication


Featured research published by Guénaël Cabanes.


international symposium on neural networks | 2008

A local density-based simultaneous two-level algorithm for topographic clustering

Guénaël Cabanes; Younès Bennani

Determining the optimal number of clusters is an ill-posed problem: without a priori knowledge, there is no simple way of knowing that number. The purpose of this paper is to provide a simultaneous two-level clustering algorithm based on the self-organizing map, called DS2L-SOM, which learns the structure of the data and its segmentation at the same time. The algorithm is based on both distance and density measures in order to accomplish a topographic clustering. An important feature of the algorithm is that the number of clusters is discovered automatically. A great advantage of the proposed algorithm, compared to common partitional clustering methods, is that it is not restricted to convex clusters but can recognize arbitrarily shaped and touching clusters. The validity and stability of this algorithm are superior to those of standard two-level clustering methods such as SOM+K-means and SOM+hierarchical agglomerative clustering, as demonstrated on a set of critical clustering problems.
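The density-plus-distance merging idea can be illustrated with a small sketch (hypothetical names and thresholds, not the authors' DS2L-SOM code): prototypes stand in for SOM units, local density is estimated at each prototype, neighbouring prototypes are linked unless a density "valley" separates them, and the number of clusters emerges as the number of connected groups.

```python
import numpy as np

def local_density(data, prototypes, sigma=0.5):
    # Gaussian-kernel density of the data as seen from each prototype.
    d = np.linalg.norm(prototypes[:, None, :] - data[None, :, :], axis=2)
    return np.exp(-(d / sigma) ** 2).sum(axis=1)

def merge_prototypes(prototypes, density, link_dist=1.5, valley_ratio=0.5):
    # Union-find over prototype pairs: link two prototypes when they are
    # close in data space AND their densities are comparable (no valley
    # between them), mimicking the combined distance + density criterion.
    n = len(prototypes)
    parent = list(range(n))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for i in range(n):
        for j in range(i + 1, n):
            close = np.linalg.norm(prototypes[i] - prototypes[j]) < link_dist
            lo, hi = sorted([density[i], density[j]])
            if close and lo > valley_ratio * hi:
                parent[find(i)] = find(j)
    return [find(i) for i in range(n)]
```

Two groups of prototypes separated by a low-density gap end up in two connected components, so no cluster count has to be supplied in advance.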


international conference on machine learning and applications | 2007

A simultaneous two-level clustering algorithm for automatic model selection

Guénaël Cabanes; Younès Bennani

One of the most crucial questions in many real-world clustering applications is determining a suitable number of clusters, also known as the model selection problem. Determining the optimal number of clusters is an ill-posed problem: without a priori knowledge, there is no simple way of knowing that number. In this paper we propose a new two-level clustering algorithm based on the self-organizing map, called S2L-SOM, which determines the number of clusters automatically during learning. Estimating the true number of clusters is related to cluster stability, which reflects the validity of the clusters generated by the learning algorithm. To measure this stability we use a sub-sampling method. The great advantage of the proposed algorithm, compared to common partitional clustering methods, is that it is not restricted to convex clusters but can recognize arbitrarily shaped clusters. The validity of this algorithm is superior to that of standard two-level clustering methods such as SOM+K-means and SOM+hierarchical agglomerative clustering, as demonstrated on a set of critical clustering problems.
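The sub-sampling stability idea can be sketched as follows (an illustrative reconstruction, with a deterministic K-means standing in for the prototype level; function names and the agreement measure are assumptions): cluster two random sub-samples, then check how consistently the points shared by both sub-samples are grouped. A suitable model yields highly reproducible groupings.

```python
import numpy as np

def kmeans_labels(X, k, iters=50):
    # Deterministic farthest-point initialisation, then Lloyd updates.
    C = [X[0]]
    for _ in range(1, k):
        d = np.min(((X[:, None] - np.array(C)[None]) ** 2).sum(-1), axis=1)
        C.append(X[np.argmax(d)])
    C = np.array(C)
    for _ in range(iters):
        lab = np.argmin(((X[:, None] - C[None]) ** 2).sum(-1), axis=1)
        C = np.array([X[lab == j].mean(0) if (lab == j).any() else C[j]
                      for j in range(k)])
    return lab

def pair_agreement(lab_a, lab_b):
    # Rand-index style agreement: fraction of point pairs grouped the
    # same way (together / apart) by both labelings.
    same_a = lab_a[:, None] == lab_a[None, :]
    same_b = lab_b[:, None] == lab_b[None, :]
    mask = ~np.eye(len(lab_a), dtype=bool)
    return (same_a == same_b)[mask].mean()
```

Because the agreement compares pair relations rather than raw label ids, it is insensitive to label permutations between the two runs.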


Neural Networks | 2012

2012 Special Issue: Enriched topological learning for cluster detection and visualization

Guénaël Cabanes; Younès Bennani; Dominique Fresneau

The exponential growth of data generates very large databases, and the growing number of data dimensions and data objects presents tremendous challenges for effective data analysis and exploration methods and tools. It therefore becomes crucial to have methods able to construct a condensed description of the properties and structure of the data, as well as visualization tools capable of representing the data structure from these condensed descriptions. The purpose of the work described in this paper is to develop a method for describing data through enriched and segmented prototypes, using a topological clustering algorithm. We then introduce a visualization tool that can highlight the structure within and between groups in the data. We show, on several artificial and real databases, the relevance of the proposed approach.


Pattern Recognition | 2013

A new topological clustering algorithm for interval data

Guénaël Cabanes; Younès Bennani; Renaud Destenay; André Hardy

Clustering is a very powerful tool for the automatic detection of relevant sub-groups in unlabeled data sets. In this paper we focus on interval data, i.e., data where the objects are defined as hyper-rectangles. We propose a new clustering algorithm for interval data, based on the learning of a Self-Organizing Map. The major advantage of our approach is that the number of clusters is determined automatically; no a priori hypothesis on the number of clusters is required. Experimental results confirm the effectiveness of the proposed algorithm when applied to interval data.
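For intuition, a common way to compare two hyper-rectangles is a coordinate-wise Hausdorff distance (an illustrative choice for interval data; the paper's SOM adaptation is not reproduced here). Each object is a box given by lower and upper bounds per dimension.

```python
import numpy as np

def interval_hausdorff(a_lo, a_hi, b_lo, b_hi):
    # Per dimension, the Hausdorff distance between the intervals
    # [a_lo, a_hi] and [b_lo, b_hi] is max(|a_lo - b_lo|, |a_hi - b_hi|);
    # the per-dimension values are combined with a Euclidean norm.
    per_dim = np.maximum(np.abs(a_lo - b_lo), np.abs(a_hi - b_hi))
    return float(np.sqrt((per_dim ** 2).sum()))
```

With such a dissimilarity in hand, prototype-based methods can be adapted to interval data by letting each prototype itself be a hyper-rectangle.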


Archive | 2010

Learning the Number of Clusters in Self Organizing Map

Guénaël Cabanes; Younès Bennani

The Self-Organizing Map (SOM; Kohonen, 1984, 2001) is a neuro-computational algorithm that maps high-dimensional data to a two-dimensional space through a competitive and unsupervised learning process. Self-Organizing Maps differ from other artificial neural networks in that they use a neighborhood function to preserve the topological properties of the input space. This unsupervised learning algorithm is a popular nonlinear technique for dimensionality reduction and data visualization. The SOM is often used as a first phase of unsupervised classification (i.e., clustering). Clustering methods can automatically detect relevant sub-groups, or clusters, in unlabeled data sets when one has no prior knowledge about the hidden structure of these data. Patterns in the same cluster should be similar to each other, while patterns in different clusters should not (internal homogeneity and external separation). Clustering plays an indispensable role in understanding various phenomena described by data sets. A clustering problem can be defined as the task of partitioning a set of objects into a collection of mutually disjoint subsets. Clustering is a segmentation problem that is considered one of the most challenging in unsupervised learning, and various approaches have been proposed to solve it (Jain & Dubes, 1988). An efficient approach to such grouping problems is based on the learning of a Self-Organizing Map. In the first phase of the process, the standard SOM approach is used to compute a set of reference vectors (prototypes) representing local means of the data. In the second phase, the obtained prototypes are grouped to form the final partitioning using a traditional clustering method (e.g., K-means or hierarchical methods). Such an approach is called a two-level clustering method. In this work, we focus particular attention on two-level clustering algorithms.
One of the most crucial questions in many real-world clustering applications is how to determine a suitable number of clusters K, also known as the model selection problem. Without a priori knowledge there is no simple way of knowing that number. The purpose of our work is to provide a simultaneous two-level clustering approach using SOM, learning at the same time the structure of the data and its segmentation, using both distance and density information. This new clustering algorithm assumes that a cluster is a dense region of objects surrounded by a region of low density (Yue et al., 2004; Ultsch, 2005; Ocsa et al., 2007; Pamudurthy et al., 2007).
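The competitive, neighbourhood-based SOM learning described above can be sketched minimally in Python (hyper-parameters and decay schedules are illustrative assumptions, not Kohonen's reference implementation): each data point pulls its best-matching unit and that unit's grid neighbours toward it, which is what preserves the grid topology.

```python
import numpy as np

def train_som(X, rows=5, cols=5, epochs=20, lr=0.5, sigma=1.5, seed=0):
    rng = np.random.RandomState(seed)
    W = rng.normal(size=(rows * cols, X.shape[1]))            # unit weights
    grid = np.array([(i, j) for i in range(rows) for j in range(cols)], float)
    for t in range(epochs):
        s = sigma * (1 - t / epochs) + 0.1   # shrinking neighbourhood radius
        a = lr * (1 - t / epochs) + 0.01     # decaying learning rate
        for x in X[rng.permutation(len(X))]:
            bmu = np.argmin(((W - x) ** 2).sum(axis=1))       # competition
            h = np.exp(-((grid - grid[bmu]) ** 2).sum(axis=1) / (2 * s * s))
            W += a * h[:, None] * (x - W)                     # cooperation
    return W
```

Shrinking the neighbourhood radius over time is what lets the map first unfold globally and then fine-tune locally; the resulting weight vectors are the prototypes used by the second clustering level.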


international conference on neural information processing | 2010

Learning topological constraints in self-organizing map

Guénaël Cabanes; Younès Bennani

The Self-Organizing Map (SOM) is a popular algorithm for analyzing the structure of a dataset. However, some topological constraints of the SOM are fixed before learning and may not be relevant to the data structure. In this paper we propose to improve SOM performance with a new algorithm that learns the topological constraints of the map using information about the data structure. Experiments on artificial and real databases show that the algorithm achieves better results than the standard SOM. This is not the case with trivial topological constraint relaxation, because of the large increase in topological error it causes.
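The topological (topographic) error referred to above is commonly measured as the fraction of data points whose two best-matching units are not adjacent on the map grid; a small sketch, assuming a rectangular grid with 4-connectivity:

```python
import numpy as np

def topographic_error(X, W, grid):
    # W: unit weight vectors; grid: the units' integer map coordinates.
    errors = 0
    for x in X:
        d = ((W - x) ** 2).sum(axis=1)
        b1, b2 = np.argsort(d)[:2]                   # two best-matching units
        if np.abs(grid[b1] - grid[b2]).sum() > 1:    # not grid-adjacent
            errors += 1
    return errors / len(X)
```

A well-organized map keeps this error near zero: units that are close in data space are also close on the grid, so relaxing the grid constraints carelessly shows up directly in this measure.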


ieee symposium series on computational intelligence | 2015

Collaborative Clustering: How to Select the Optimal Collaborators?

Parisa Rastin; Guénaël Cabanes; Nistor Grozavu; Younès Bennani

The aim of collaborative clustering is to reveal the common underlying structure of data spread across multiple data sites by applying clustering techniques. The idea is that each collaborator shares some information about the segmentation (structure) of its local data and improves its own clustering with the information provided by the other collaborators. This paper analyzes the impact of the quality of the potential collaborators on the quality of the collaboration, for a topological collaborative clustering algorithm based on the learning of a Self-Organizing Map. Experimental analysis on four real vector data sets shows that the diversity between collaborators impacts the quality of the collaboration. We also show that internal quality indexes are a good estimator of the increase in quality due to the collaboration.


international symposium on neural networks | 2012

Change detection in data streams through unsupervised learning

Guénaël Cabanes; Younès Bennani

In many cases, databases are in constant evolution, with new data arriving continuously. Data streams pose several unique problems that make standard data analysis methods obsolete: these databases are constantly online, growing with the arrival of new data, and the probability distribution associated with the data may change over time. We propose in this paper a method of synthetic representation of the data structure for efficient storage of information, together with a measure of dissimilarity between these representations for detecting changes in the stream's structure.
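The overall scheme can be illustrated with a deliberately simplified sketch (quantile prototypes and a nearest-prototype dissimilarity are assumptions for illustration, not the paper's representation): each window of the stream is condensed into a few prototypes, and a jump in the dissimilarity between consecutive summaries signals a change in the stream's structure.

```python
import numpy as np

def summarize(window, k=5):
    # Crude condensed description of a window: k per-dimension
    # quantile prototypes (from minimum to maximum).
    return np.quantile(window, np.linspace(0, 1, k), axis=0)

def summary_dissimilarity(P, Q):
    # Symmetric average distance from each prototype to its nearest
    # counterpart in the other summary.
    d = np.linalg.norm(P[:, None] - Q[None], axis=2)
    return 0.5 * (d.min(axis=1).mean() + d.min(axis=0).mean())
```

Only the summaries need to be stored, so the raw stream can be discarded while change detection remains possible.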


Pattern Recognition | 2017

Entropy based probabilistic collaborative clustering

Basarab Matei; Guénaël Cabanes; Nistor Grozavu; Younès Bennani; Antoine Cornuéjols

Unsupervised machine learning approaches involving several clustering algorithms working together to tackle difficult data sets are a recent area of research with a large number of applications, such as clustering of distributed data, multi-expert clustering, multi-scale clustering analysis or multi-view clustering. Most of these frameworks can be regrouped under the umbrella of collaborative clustering, the aim of which is to reveal the common underlying structures found by the different algorithms while analyzing the data. Within this context, the purpose of this article is to propose a collaborative framework lifting the limitations of many of the previously proposed methods: our proposed collaborative learning method makes it possible for a wide range of clustering algorithms from different families to work together based solely on their clustering solutions, thus lifting the previous limitation of requiring identical prototypes between the different collaborators. Our proposed framework uses a variational EM as its theoretical basis for the collaboration process and can be applied to any of the previously mentioned collaborative contexts. In this article, we give the main ideas and theoretical foundations of our method, and we demonstrate its effectiveness in a series of experiments on real data sets as well as data sets from the literature.


international symposium on neural networks | 2014

Diversity analysis in collaborative clustering

Nistor Grozavu; Guénaël Cabanes; Younès Bennani

The aim of collaborative clustering is to reveal the common structure of data distributed across different sites. Topological collaborative clustering, based on Self-Organizing Maps (SOM), is an unsupervised learning method able to use the output of other SOMs from other sites during learning. This paper investigates the impact of the diversity between collaborators on the quality of the collaboration and presents a study of different diversity indexes for collaborative clustering. Based on experiments on artificial and real datasets, we demonstrate that the diversity between collaborators can have an important impact on the quality of the collaboration and that not all diversity indexes are relevant for this task.
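One plausible diversity index between two collaborators' partitions is one minus the normalized mutual information, computable from the contingency table (an illustrative choice for intuition; the paper studies several indexes, not necessarily this one):

```python
import numpy as np

def nmi(a, b):
    # Normalized mutual information between two label vectors,
    # computed from their contingency table.
    a, b = np.asarray(a), np.asarray(b)
    n = len(a)
    ka, kb = a.max() + 1, b.max() + 1
    cont = np.zeros((ka, kb))
    for i, j in zip(a, b):
        cont[i, j] += 1
    p = cont / n
    pa, pb = p.sum(axis=1), p.sum(axis=0)
    nz = p > 0
    mi = (p[nz] * np.log(p[nz] / (pa[:, None] * pb[None, :])[nz])).sum()
    ha = -(pa[pa > 0] * np.log(pa[pa > 0])).sum()
    hb = -(pb[pb > 0] * np.log(pb[pb > 0])).sum()
    return mi / max(np.sqrt(ha * hb), 1e-12)

def diversity(a, b):
    # 0 when the partitions are identical, 1 when they are independent.
    return 1.0 - nmi(a, b)
```

Such an index lets a collaborator rank the others before deciding whose segmentation to incorporate.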

Collaboration


Dive into Guénaël Cabanes's collaborations.

Top Co-Authors

Nistor Grozavu

Centre national de la recherche scientifique

Hatim Chahdi

University of Montpellier