Haytham Elghazel | Researchain

Publication

Featured research published by Haytham Elghazel.

International Symposium on Methodologies for Intelligent Systems | 2006

A new clustering approach for symbolic data and its validation: application to the healthcare data

Haytham Elghazel; Véronique Deslandres; Mohand-Said Hacid; Alain Dussauchoy; Hamamache Kheddouci

Graph coloring is used to characterize some properties of graphs. A b-coloring of a graph G (using colors 1,2,...,k) is a coloring of the vertices of G such that (i) two neighbors have different colors (proper coloring) and (ii) each color class contains a dominating vertex, that is, a vertex adjacent to at least one vertex in each of the other k-1 color classes. In this paper, we propose a new clustering technique based on a b-coloring of a graph. Additionally, we provide a cluster validation algorithm, which aims to find the optimal number of clusters by evaluating the color-dominating-vertex property. We apply this clustering technique to discover a new typology of hospital stays in the French healthcare system.
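
The two conditions in the definition above can be checked mechanically. The following is a minimal sketch, not the paper's clustering algorithm: it only tests whether a given coloring is a b-coloring. The adjacency-dictionary representation and the function names are assumptions made for illustration.

```python
from typing import Dict, Hashable, List, Set

Graph = Dict[Hashable, Set[Hashable]]   # vertex -> set of neighbors (undirected)
Coloring = Dict[Hashable, int]          # vertex -> color


def is_proper(graph: Graph, color: Coloring) -> bool:
    """Condition (i): no two neighbors share a color."""
    return all(color[u] != color[v] for u in graph for v in graph[u])


def dominating_vertices(graph: Graph, color: Coloring) -> Dict[int, List[Hashable]]:
    """For each color, list its vertices that see every other color among their neighbors."""
    all_colors = set(color.values())
    dominating: Dict[int, List[Hashable]] = {c: [] for c in all_colors}
    for v, neighbors in graph.items():
        seen = {color[u] for u in neighbors}
        if all_colors - {color[v]} <= seen:
            dominating[color[v]].append(v)
    return dominating


def is_b_coloring(graph: Graph, color: Coloring) -> bool:
    """Conditions (i) and (ii): proper, and every color class has a dominating vertex."""
    return is_proper(graph, color) and all(dominating_vertices(graph, color).values())


# Hypothetical example: the path a - b - c - d.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
two_colors = {"a": 1, "b": 2, "c": 1, "d": 2}
three_colors = {"a": 1, "b": 2, "c": 3, "d": 1}
```

On this path, `two_colors` satisfies both conditions, while `three_colors` is proper but fails condition (ii) because no vertex of color 1 is adjacent to both colors 2 and 3.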

Machine Learning | 2015

Unsupervised feature selection with ensemble learning

Haytham Elghazel; Alex Aussem

In this paper, we show that the way internal estimates are used to measure variable importance in Random Forests is also applicable to feature selection in unsupervised learning. We propose a new method called Random Cluster Ensemble (RCE for short) that estimates the out-of-bag feature importance from an ensemble of partitions. Each partition is constructed using a different bootstrap sample and a random subset of the features. We provide empirical results on nineteen benchmark data sets indicating that RCE, boosted with a recursive feature elimination scheme (RFE) (Guyon and Elisseeff, Journal of Machine Learning Research, 3:1157–1182, 2003), can lead to significant improvement in terms of clustering accuracy over several state-of-the-art supervised and unsupervised algorithms, with a very limited subset of features. The method shows promise for dealing with very large domains. All results, datasets and algorithms are available online (http://perso.univ-lyon1.fr/haytham.elghazel/RCE.zip).
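
As a rough illustration of the sampling step described above, the sketch below draws one bootstrap sample, its out-of-bag complement, and a random feature subset. It does not implement RCE itself: the clustering step and the importance estimate are omitted, and the function name and parameters are hypothetical.

```python
import random
from typing import List, Tuple


def draw_view(
    n_samples: int,
    n_features: int,
    n_sub_features: int,
    rng: random.Random,
) -> Tuple[List[int], List[int], List[int]]:
    """Draw the data 'view' used to build one ensemble member.

    Returns (in_bag, out_of_bag, features):
      in_bag      row indices sampled with replacement (bootstrap sample)
      out_of_bag  row indices never drawn, available for out-of-bag estimates
      features    column indices sampled without replacement
    """
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    in_bag_set = set(in_bag)
    out_of_bag = [i for i in range(n_samples) if i not in in_bag_set]
    features = sorted(rng.sample(range(n_features), n_sub_features))
    return in_bag, out_of_bag, features


# One view per ensemble member; the sizes here are arbitrary.
rng = random.Random(0)
views = [draw_view(n_samples=20, n_features=10, n_sub_features=3, rng=rng) for _ in range(5)]
```

Each ensemble member would then be clustered on its in-bag rows restricted to its feature subset, with the out-of-bag rows held back for the importance estimate.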

Pattern Recognition Letters | 2012

A semi-supervised feature ranking method with ensemble learning

Fazia Bellal; Haytham Elghazel; Alex Aussem

We consider the problem of using a large amount of unlabeled data to improve the efficiency of feature selection in high dimensions when only a small number of labeled examples is available. We propose a new method called semi-supervised ensemble learning guided feature ranking (SEFR for short), which combines a bagged ensemble of standard semi-supervised approaches with a permutation-based out-of-bag feature importance measure that takes into account both labeled and unlabeled data. We provide empirical results on several benchmark data sets indicating that SEFR can lead to significant improvement over state-of-the-art supervised and semi-supervised algorithms.

Expert Systems With Applications | 2016

Ensemble Multi-label Text Categorization based on Rotation Forest and Latent Semantic Indexing

Haytham Elghazel; Alex Aussem; Ouadie Gharroudi; Wafa Saadaoui

Text categorization has gained increasing popularity in recent years due to the explosive growth of multimedia documents. As a document can be associated with multiple non-exclusive categories simultaneously (e.g., Virus, Health, Sports, and Olympic Games), text categorization provides many opportunities for developing novel multi-label learning approaches devoted specifically to textual data. In this paper, we propose an ensemble multi-label classification method for text categorization based on four key ideas: (1) performing Latent Semantic Indexing based on distinct orthogonal projections on lower-dimensional spaces of concepts; (2) random splitting of the vocabulary; (3) document bootstrapping; and (4) the use of BoosTexter as a powerful multi-label base learner for text categorization to simultaneously encourage diversity and individual accuracy in the committee. Diversity of the ensemble is promoted through random splits of the vocabulary that lead to different orthogonal projections on lower-dimensional latent concept spaces. Accuracy of the committee members is promoted through the underlying latent semantic structure uncovered in the text. The combination of rotation-based ensemble construction and Latent Semantic Indexing projection is shown to bring about significant improvements in terms of Average Precision, Coverage, Ranking loss and One error compared to five state-of-the-art approaches across 14 real-world textual data sets covering a wide variety of topics including health, education, business, science and arts.

International Conference on Data Mining | 2010

Feature Selection for Unsupervised Learning Using Random Cluster Ensembles

Haytham Elghazel; Alex Aussem

In this paper, we propose another extension of the Random Forests paradigm to unlabeled data, leading to localized unsupervised feature selection (FS). We show that the way internal estimates are used to measure variable importance in Random Forests is also applicable to FS in unsupervised learning. We first illustrate the clustering performance of the proposed method on various data sets based on widely used external criteria of clustering quality. We then assess the accuracy and the scalability of the FS procedure on UCI and real labeled data sets and compare its effectiveness against other FS methods.

International Conference on Innovations in Information Technology | 2011

Graph modeling based video event detection

Najib Ben Aoun; Haytham Elghazel; Chokri Ben Amar

Video processing and analysis are active fields in research and industry. Information detection and retrieval have become challenging tasks, especially with the spread of multimedia applications and the growing number of video acquisition devices such as surveillance cameras and phone cameras. These devices produce large amounts of diverse and complex video data, which makes event detection in video difficult. Many video event detection methods have been developed; they comprise two fundamental parts: video indexing and video classification. In this paper, we introduce a new video event detection system based on graphs. Our system models each video frame as a graph, in addition to a motion description. These models are then classified and events are detected. Experimental results demonstrate the effectiveness and robustness of our system.

Canadian Conference on Artificial Intelligence | 2014

A Comparison of Multi-Label Feature Selection Methods Using the Random Forest Paradigm

Ouadie Gharroudi; Haytham Elghazel; Alex Aussem

In this paper, we discuss three wrapper multi-label feature selection methods based on the Random Forest paradigm. These variants differ in the way they consider label dependence within the feature selection process. To assess their performance, we conduct an extensive experimental comparison of these strategies against recently proposed approaches using seven benchmark multi-label data sets from different domains. Random Forest handles feature selection accurately in the multi-label context. Surprisingly, taking into account the dependence between labels in the context of ensemble multi-label feature selection was not found to be very effective.

Expert Systems With Applications | 2014

A hybrid algorithm for Bayesian network structure learning with application to multi-label learning

Maxime Gasse; Alex Aussem; Haytham Elghazel

We present a novel hybrid algorithm for Bayesian network structure learning, called H2PC. It first reconstructs the skeleton of a Bayesian network and then performs a Bayesian-scoring greedy hill-climbing search to orient the edges. The algorithm is based on divide-and-conquer constraint-based subroutines to learn the local structure around a target variable. We conduct two series of experimental comparisons of H2PC against Max-Min Hill-Climbing (MMHC), which is currently the most powerful state-of-the-art algorithm for Bayesian network structure learning. First, we use eight well-known Bayesian network benchmarks with various data sizes to assess the quality of the learned structure returned by the algorithms. Our extensive experiments show that H2PC outperforms MMHC in terms of goodness of fit to new data and quality of the network structure with respect to the true dependence structure of the data. Second, we investigate H2PC's ability to solve the multi-label learning problem. We provide theoretical results to characterize and identify graphically the so-called minimal label powersets that appear as irreducible factors in the joint distribution under the faithfulness condition. The multi-label learning problem is then decomposed into a series of multi-class classification problems, where each multi-class variable encodes a label powerset. H2PC is shown to compare favorably to MMHC in terms of global classification accuracy over ten multi-label data sets covering different application domains. Overall, our experiments support the conclusion that local structural learning with H2PC in the form of local neighborhood induction is a theoretically well-motivated and empirically effective learning framework that is well suited to multi-label learning. The source code (in R) of H2PC as well as all data sets used for the empirical tests are publicly available.
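
The decomposition into multi-class problems can be pictured with a small encoding sketch. In the paper the label groups are the minimal label powersets identified from the learned network; here the grouping is supplied by hand as an assumption, and the function name is hypothetical. This is not the H2PC code, which is written in R.

```python
from typing import Dict, List, Sequence, Tuple

LabelRow = Sequence[int]  # one 0/1 indicator per label


def encode_powersets(
    Y: Sequence[LabelRow],
    groups: Sequence[Sequence[int]],
) -> Tuple[List[List[int]], List[Dict[Tuple[int, ...], int]]]:
    """Turn each group of labels into one multi-class target.

    Y       rows of binary label indicators
    groups  partition of the label columns (assumed given for this sketch)
    Returns one list of class ids per group, plus the lookup table that maps
    each observed label combination to its class id.
    """
    encoded: List[List[int]] = []
    tables: List[Dict[Tuple[int, ...], int]] = []
    for group in groups:
        table: Dict[Tuple[int, ...], int] = {}
        column: List[int] = []
        for row in Y:
            key = tuple(row[j] for j in group)
            # Assign the next unused class id the first time a combination appears.
            column.append(table.setdefault(key, len(table)))
        encoded.append(column)
        tables.append(table)
    return encoded, tables


# Hypothetical data: three labels, with labels 0 and 1 grouped together.
Y = [[1, 0, 1], [1, 0, 0], [0, 1, 1], [1, 0, 1]]
targets, lookup = encode_powersets(Y, groups=[[0, 1], [2]])
```

Each entry of `targets` can then be handed to any multi-class classifier, and the lookup tables invert the predicted class ids back to label combinations.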

European Conference on Machine Learning | 2012

An experimental comparison of hybrid algorithms for Bayesian network structure learning

Maxime Gasse; Alex Aussem; Haytham Elghazel

We present a novel hybrid algorithm for Bayesian network structure learning, called Hybrid HPC (H2PC). It first reconstructs the skeleton of a Bayesian network and then performs a Bayesian-scoring greedy hill-climbing search to orient the edges. It is based on a subroutine called HPC that combines ideas from incremental and divide-and-conquer constraint-based methods to learn the parents and children of a target variable. We conduct an experimental comparison of H2PC against Max-Min Hill-Climbing (MMHC), which is currently the most powerful state-of-the-art algorithm for Bayesian network structure learning, on several benchmarks with various data sizes. Our extensive experiments show that H2PC outperforms MMHC both in terms of goodness of fit to new data and in terms of the quality of the network structure itself, which is closer to the true dependence structure of the data. The source code (in R) of H2PC as well as all data sets used for the empirical tests are publicly available.

Computer Analysis of Images and Patterns | 2011

Graph aggregation based image modeling and indexing for video annotation

Najib Ben Aoun; Haytham Elghazel; Mohand-Said Hacid; Chokri Ben Amar

With the rapid growth of video multimedia databases and the lack of textual descriptions for many of them, video annotation has become a highly desired task. Conventional systems try to annotate a video query by simply finding its most similar videos in the database. Although the video annotation problem has been tackled in the last decade, no attention has been paid to the problem of assembling video keyframes in a sensible way to answer a given video query when no single candidate video turns out to be similar to the query. In this paper, we introduce a graph-based image modeling and indexing system for video annotation. Our system improves the video annotation task by assembling a set of graphs representing different keyframes of different videos to compose the video query. The experimental results demonstrate the effectiveness of our system in annotating videos that cannot be annotated by classical approaches.
