Hanene Azzag | Researchain

Publication

Featured research published by Hanene Azzag.

Electronic Commerce | 2007

A New Approach of Data Clustering Using a Flock of Agents

Fabien Picarougne; Hanene Azzag; Gilles Venturini; Christiane Guinot

This paper presents a new bio-inspired algorithm (FClust) that dynamically creates and visualizes groups of data. The algorithm uses the concept of a flock of agents that move together in a complex manner while following simple local rules. Each agent represents one data item. The agents move together in a 2D environment with the aim of creating homogeneous groups of data. These groups are visualized in real time and help the domain expert understand the underlying structure of the data set, for example a realistic number of classes, clusters of similar data, and isolated data. We also present several extensions of this algorithm, which reduce its computational cost and make use of a 3D display. The algorithm is then tested on artificial and real-world data, and a heuristic algorithm is used to evaluate the relevance of the obtained partitioning.

Neural Networks | 2016

A new Growing Neural Gas for clustering data streams

Mohammed Ghesmoune; Mustapha Lebbah; Hanene Azzag

Clustering data streams is becoming the most efficient way to cluster a massive dataset. This task requires a process capable of partitioning observations continuously under memory and time restrictions. In this paper we present a new algorithm, called G-Stream, for clustering data streams by making one pass over the data. G-Stream is based on growing neural gas, which allows us to discover clusters of arbitrary shapes without any assumptions on the number of clusters. By using a reservoir and applying a fading function, the quality of clustering is improved. The performance of the proposed algorithm is evaluated on public datasets.
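The idea of a fading function can be illustrated with a minimal sketch. The abstract does not state the form of the function, so the exponential decay used here, along with the names `fading_weight`, `faded_mean`, and `decay_rate`, are assumptions made for illustration and should not be read as the G-Stream definition.

```python
def fading_weight(current_time: float, created_time: float, decay_rate: float = 0.5) -> float:
    """Weight of an observation that decays exponentially with its age.

    Assumed form: 2 ** (-decay_rate * age). The abstract does not specify
    the function actually used by G-Stream.
    """
    age = current_time - created_time
    if age < 0:
        raise ValueError("created_time must not be later than current_time")
    return 2.0 ** (-decay_rate * age)


def faded_mean(values, created_times, current_time, decay_rate=0.5):
    """Weighted mean in which older observations contribute less."""
    weights = [fading_weight(current_time, t, decay_rate) for t in created_times]
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / total
```

With such a weighting, a summary of the stream drifts toward recent observations, which is one plausible way for a one-pass method to follow evolving data.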

Pacific-Asia Conference on Knowledge Discovery and Data Mining | 2015

Clustering Over Data Streams Based on Growing Neural Gas

Mohammed Ghesmoune; Mustapha Lebbah; Hanene Azzag

Clustering data streams requires a process capable of partitioning observations continuously under memory and time restrictions. In this paper we present a new algorithm, called G-Stream, for clustering data streams by making one pass over the data. G-Stream is based on growing neural gas, which allows us to discover clusters of arbitrary shape without any assumptions on the number of clusters. By using a reservoir and applying a fading function, the quality of clustering is improved. The performance of the proposed algorithm is evaluated on public data sets.

Procedia Computer Science | 2015

Micro-Batching Growing Neural Gas for Clustering Data Streams Using Spark Streaming

Mohammed Ghesmoune; Mustapha Lebbah; Hanene Azzag

In recent years, the data stream clustering problem has gained considerable attention in the literature. Clustering data streams requires a process capable of partitioning observations continuously while taking into account restrictions of memory and time. In this paper we present MBG-Stream, a micro-batching version of the growing neural gas approach, aimed at clustering data streams by making one pass over the data. MBG-Stream allows us to discover clusters of arbitrary shapes without any assumptions on the number of clusters. The proposed algorithm is implemented on a “distributed” streaming platform, the Spark Streaming API, and its performance is evaluated on public data sets.
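Micro-batching in general can be sketched in a few lines: the stream is cut into small consecutive batches, and a global summary is updated once per batch instead of once per observation. The sketch below is plain Python and does not use the Spark Streaming API. The names `micro_batches` and `running_mean_over_batches` are hypothetical, and the running mean stands in for the far richer model that MBG-Stream maintains.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def micro_batches(stream: Iterable[float], batch_size: int) -> Iterator[List[float]]:
    """Yield consecutive lists of at most batch_size items from stream."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    iterator = iter(stream)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def running_mean_over_batches(stream: Iterable[float], batch_size: int) -> float:
    """Single pass over the stream: update a running mean once per micro-batch."""
    count, mean = 0, 0.0
    for batch in micro_batches(stream, batch_size):
        batch_mean = sum(batch) / len(batch)
        count += len(batch)
        # Merge the batch summary into the global summary.
        mean += (batch_mean - mean) * len(batch) / count
    return mean
```

Each observation is read exactly once, and only the current batch plus the summary are held in memory, which is the property the one-pass requirement relies on.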

International Conference on Neural Information Processing | 2014

G-Stream: Growing Neural Gas over Data Stream

Mohammed Ghesmoune; Hanene Azzag; Mustapha Lebbah

Streaming data clustering is becoming the most efficient way to cluster a very large data set. In this paper we present a new approach, called G-Stream, for topological clustering of evolving data streams. G-Stream allows one to discover clusters of arbitrary shape without any assumption on the number of clusters and by making one pass over the data. The topological structure is represented by a graph in which each node represents a set of “close” data points and neighboring nodes are connected by edges. A reservoir temporarily holds the data points that are very distant from the current prototypes; this avoids needless movements of the nearest nodes toward those points and therefore improves the quality of clustering. The performance of the proposed algorithm is evaluated on both synthetic and real-world data sets.

International World Wide Web Conferences | 2005

Automatic generation of web portals using artificial ants

Hanene Azzag; Gilles Venturini; Christiane Guinot

In this work we present a new model, named AntTree, based on artificial ants for hierarchical document clustering. The model is inspired by the self-assembly behavior of real ants. We have simulated this behavior to build a hierarchical, tree-structured partitioning of a set of documents according to the similarities between them. We have successfully compared our results to those obtained by ascending hierarchical clustering.

Machine Learning | 2017

Big Data: from collection to visualization

Mohammed Ghesmoune; Hanene Azzag; Salima Benbernou; Mustapha Lebbah; Tarn Duong; Mourad Ouziri

Organisations are increasingly relying on Big Data to discover correlations and patterns in data that would previously have remained hidden, and to use this new information to increase the quality of their business activities. In this paper we present a ‘story’ of Big Data from the initial data collection to the final visualization, by way of data fusion, analysis, and clustering. To this end, we present a complete workflow covering two parts: (a) how to represent the heterogeneous collected data using the high performance RDF language, how to perform the fusion of the Big Data in RDF by resolving the issue of entity disambiguation, and how to query those data to provide more relevant and complete knowledge; and (b) since the data are received as data streams, we propose batchStream, a micro-batching version of the growing neural gas approach, which is capable of clustering data streams with a single pass over the data. The batchStream algorithm allows us to discover clusters of arbitrary shapes without any assumptions on the number of clusters. This Big Data workflow is implemented on the Spark platform, and we demonstrate it on synthetic and real data.

International Joint Conference on Neural Network | 2016

Distributed mean shift clustering with approximate nearest neighbours

Gaël Beck; Tarn Duong; Hanene Azzag; Mustapha Lebbah

We introduce an efficient distributed implementation of nearest neighbour mean shift clustering (NNMS). The computationally intensive nature of NNMS has so far restricted its application to complex data sets where a flexible clustering with non-ellipsoidal clusters would be beneficial. A parallel implementation of the standard serial NNMS algorithm on its own brings insufficient performance gains, so we introduce two further algorithmic improvements: a normal scale (NS) choice of the optimal number of nearest neighbours, and locality sensitive hashing (LSH) to approximate nearest neighbour searches. Combining these improvements into a single distributed algorithm, DNNMS, offers the potential for an efficient method for Big Data clustering.
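For orientation, a serial nearest neighbour mean shift can be sketched as follows: each point is moved repeatedly to the mean of its k nearest neighbours in the original data. This is a simplified illustration under stated assumptions, not the DNNMS algorithm. It uses exact brute-force neighbour search rather than LSH, runs on a single machine, and takes `k` and `iterations` as caller-supplied values instead of a normal scale choice. The function names are hypothetical.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def k_nearest(query: Point, data: List[Point], k: int) -> List[Point]:
    """Exact brute-force k nearest neighbours of query in data (Euclidean)."""
    return sorted(data, key=lambda p: math.dist(query, p))[:k]


def nn_mean_shift(data: List[Point], k: int, iterations: int = 20) -> List[List[float]]:
    """Move each point repeatedly to the mean of its k nearest data points."""
    if not 1 <= k <= len(data):
        raise ValueError("k must be between 1 and len(data)")
    shifted = [list(p) for p in data]
    for _ in range(iterations):
        new_positions = []
        for p in shifted:
            neighbours = k_nearest(p, data, k)
            # Coordinate-wise mean of the k neighbours.
            new_positions.append(
                [sum(q[d] for q in neighbours) / k for d in range(len(p))]
            )
        shifted = new_positions
    return shifted
```

Points that end up at (nearly) the same location can be assigned to the same cluster. The brute-force search costs one full scan of the data per point per iteration, which is the kind of expense that approximate search is meant to reduce.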

International Conference on Machine Learning and Applications | 2010

Map-TreeMaps: A New Approach for Hierarchical and Topological Clustering

Hanene Azzag; Mustapha Lebbah; Aymen Arfaoui

In this paper we present a new clustering method which provides self-organization of hierarchical clustering. The method represents large datasets as a forest of original trees, which are projected onto a simple 2D geometric layout using a tree map representation. The obtained partition is represented by a map of tree maps, which define a tree of data. We provide the rules that build a tree of nodes/data by using the distance between data to decide where to connect nodes. Visual and empirical results based on both synthetic and real datasets from the UCI repository are given and discussed.

International World Wide Web Conferences | 2006

Generating maps of web pages using cellular automata

Hanene Azzag; David Ratsimba; David Da Costa; Gilles Venturini; Christiane Guinot

The aim of web page visualization is to present a set of web documents to the user in a very informative and interactive way, so that he or she can navigate through these documents. In the web context, this may correspond to several user tasks: displaying the results of a search engine, or visualizing a graph of pages such as a hypertext or a surf map. In addition to web page visualization, web page clustering also greatly improves the amount of information presented to the user by highlighting the similarities between the documents [6]. In this paper we explore the use of a cellular automaton (CA) to generate such maps of web pages.
