Hicham Behja
École Normale Supérieure
Publication
Featured research published by Hicham Behja.
computer and information technology | 2013
Asmaa Benghabrit; Brahim Ouhbi; Hicham Behja; Bouchra Frikh
The explosive growth of information stored in unstructured texts has created a great demand for new and powerful tools to acquire useful information, such as text mining. Document clustering is one of its most powerful methods, by which document retrieval, organization and summarization can be achieved. However, it is challenging when dealing with a large amount of data, due to the high dimensionality of the feature space and the semantic correlation between features. In this paper, we propose a new sequential document clustering algorithm that uses statistical and semantic feature selection methods. The semantic process was proposed to improve the frequency mechanism with the semantic relations of the text documents. The proposed algorithm iteratively selects relevant features and performs clustering until convergence. To evaluate its performance, experiments on two corpora have been conducted. The obtained results show that the performance of our algorithm is superior to that obtained by the existing algorithms.
2010 International Conference on Machine and Web Intelligence | 2010
Hicham Behja; El Moukhtar Zemmouri; Abdelaziz Marzak
In this paper, we propose a new approach that makes the viewpoint notion explicit in a multiview Knowledge Discovery in Databases (KDD) process. We define a viewpoint in KDD as an analyst's perception of a KDD process, which refers to the analyst's own knowledge. Our purpose is to facilitate both the reusability and adaptability of a KDD process, and to reduce its complexity whilst maintaining the trace of past analyses in terms of viewpoints. We also propose a viewpoint-based conceptual model for the KDD process that integrates both the analyzed domain knowledge and the analyst's domain knowledge.
ieee international colloquium on information science and technology | 2014
Lamiae Demraoui; Hicham Behja; El Moukhtar Zemmouri; Rachid Ben Abbou
The key to successful integration and interoperability between applications and software products is the intelligent use and management of metadata. This can be accomplished by using the CWM standard, which provides a mechanism for exchanging metadata in the data warehousing and business intelligence domain. Moreover, it is known that the development and maintenance of information systems based on users' requirements plays a strategic role in the decision-making process. In this paper, we propose to integrate the notion of the user's viewpoint into the CWM standard to allow its reuse.
ieee international colloquium on information science and technology | 2014
Asmaa Benghabrit; Brahim Ouhbi; El Moukhtar Zemmouri; Bouchra Frikh; Hicham Behja
Feature selection is not only a key to handling the high dimensionality caused by the vector space model representation, but above all an efficient technique to reduce the noise generated by irrelevant and redundant terms. However, in order to effectively capture the most important features, both the semantic and the statistical information within the feature space should be taken into account. We therefore propose sequential and hybrid clustering and feature selection approaches that combine statistical and semantic feature weight estimation in order to select the most informative features. We first perform a comparative study of powerful statistical feature selection methods and an analysis of the semantic methods. Then, we extract the best combination of statistical and semantic methods for the sequential and hybrid approaches. Detailed experimental results on three different data sets are provided in this paper.
Next Generation Networks and Services (NGNS), 2014 Fifth International Conference on | 2014
Lamiae Demraoui; Hicham Behja; El Moukhtar Zemmouri; Rachid Benabbou
The key to successful integration and interoperability between applications and software products is the intelligent use and management of metadata. This can be accomplished by using the CWM standard, which provides a mechanism for exchanging metadata in the data warehousing and business intelligence domain. Moreover, it is known that the development and maintenance of information systems based on users' interaction plays a strategic role in the decision-making process. In this paper, we investigate the integration of the users' viewpoint into the CWM standard to enhance the reusability and interoperability of metadata during the data warehouse design process.
2012 Next Generation Networks and Services (NGNS) | 2012
El Moukhtar Zemmouri; Hicham Behja; Brahim Ouhbi; Brigitte Trousse; Abdelaziz Marzak; Youssef Benghabrit
A data mining project is usually carried out by several actors (domain experts, data analysts, KDD experts ...), each with a different viewpoint. In this paper, we propose to enhance coordination and knowledge sharing between the actors of a multi-view KDD analysis through a goal-driven modeling of the interactions between viewpoints. After a brief review of our approach to viewpoints in KDD, we first develop a Goal Model that allows the identification and representation of business objectives during the business understanding step of the KDD process. Then, based on this goal model, we define a set of relations between the viewpoints of a multi-view analysis, namely equivalence, inclusion, conflict and requirement.
information integration and web-based applications & services | 2016
Brahim Ouhbi; Mostafa Kamoune; Bouchra Frikh; El Moukhtar Zemmouri; Hicham Behja
Systematic review is the scientific process that provides reliable answers to a particular research question. There is a significant shift from a manual human approach to decision support tools that provide a semi-automated screening phase, reducing the required time and effort. Text classification is useful in determining the statistical significance level of association rules to reduce the workload in the systematic review. Several approaches to generating a rule set for rule-based classifiers have been proposed in the literature. In this paper, we show that statistical as well as semantic measures of a rule can be combined and effectively computed as a hybrid feature selection rule measure (HFSRM). Moreover, we propose a new algorithm called Rules7-hybrid feature selection (Rules7-HFSRM), which combines the classical algorithm Rules7 with the HFSRM, and apply it to the systematic review problem. Our results show that our algorithm significantly outperforms the state-of-the-art benchmark algorithms in the systematic review context.
Next Generation Networks and Services (NGNS), 2014 Fifth International Conference on | 2014
Asmaa Benghabrit; Brahim Ouhbi; El Moukhtar Zemmouri; Bouchra Frikh; Hicham Behja
Since not all the features in a dataset are important, some being redundant or irrelevant, the use of feature selection, an effective dimensionality reduction technique, is essential for web document clustering. For the clustering process, it represents the task of selecting important features for the underlying clusters. Therefore, in order to guide the web document clustering process, we propose a hybrid feature selection algorithm that simultaneously selects the most statistically and semantically informative features through a weighting model. The clustering process selects relevant features and performs document clustering iteratively until stability. The experimental results demonstrate the practical aspects of our algorithm and show that it generates more efficient clustering than that obtained by other existing algorithms.
2014 International Conference on Next Generation Networks and Services (NGNS) | 2014
Rabab Chakhmoune; Hicham Behja; Brahim Ouhbi; Youssef Benghabrit
This article discusses the problem of knowledge capitalization, which requires evaluation operations. There have been many approaches to developing specialized procedures and techniques aimed at assuring the highest level of knowledge quality. The method proposed in this paper is based on building a corporate memory. This memory capitalizes only knowledge that has been evaluated using different criteria. The method is thorough enough to evaluate the knowledge to be preserved, and retaining only crucial knowledge substantially reduces the capitalization cost, since it reduces the amount of knowledge to process. Hence, update and transfer operations are much simplified.
2011 3rd International Conference on Next Generation Networks and Services (NGNS) | 2011
El Moukhtar Zemmouri; Hicham Behja; Abdelaziz Marzak
Knowledge Discovery in Databases (KDD) is a highly complex, iterative and interactive process involving several types of knowledge and expertise. In this paper, we propose to support users of a multi-view analysis (a KDD process carried out by several experts with different viewpoints). Our objective is to enhance both the reusability of the process and the coordination between experts. To do so, we propose a formalization of viewpoints in KDD based on the CRISP-DM standard and taking into account the domain knowledge involved during a KDD process.