Asma Belhadi

University of Science and Technology, Sana'a

Publication

Featured research published by Asma Belhadi.

Knowledge Based Systems | 2018

Extracting useful knowledge from event logs: A frequent itemset mining approach

Youcef Djenouri; Asma Belhadi; Philippe Fournier-Viger

Business process analysis is a key activity that aims at increasing the efficiency of business operations. In recent years, several data mining based methods have been designed for discovering interesting patterns in event logs. A popular type of method applies frequent itemset mining to extract patterns indicating how resources and activities are frequently used. Although these methods are useful, they have two important limitations. First, they are designed to be applied to original event logs. Because they do not consider other perspectives on the data that could be obtained by applying data transformations, many patterns that may represent important information for businesses are missed. Second, they can generate a large number of patterns, since they use minimum support as the only constraint for selecting patterns. Analyzing a large number of patterns is time-consuming for users, and many irrelevant patterns may be found. To address these issues, this paper presents an improved event log analysis approach named AllMining. It includes a novel pre-processing method that constructs multiple types of transaction databases from the same original event log using transformations. This makes it possible to extract many new useful types of patterns from event logs with frequent itemset mining techniques. To address the second issue, a pruning strategy is further developed based on a novel concept of pattern coverage, in order to present decision makers with a small set of patterns that covers many events. Results of experiments on real-life event logs show that the proposed approach is promising compared to existing frequent itemset mining approaches and state-of-the-art process model algorithms.
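To make the role of the minimum support constraint concrete, here is a minimal sketch of frequent itemset mining on a toy transaction database. It is a brute-force enumeration written for illustration only, not the AllMining method; the transactions, the activity names, and the function `frequent_itemsets` are all hypothetical.

```python
from itertools import combinations

# Hypothetical toy data: each set holds the activities observed in one trace
# of an event log after it has been turned into a transaction database.
transactions = [
    {"register", "check", "approve"},
    {"register", "check", "reject"},
    {"register", "check", "approve"},
    {"register", "approve"},
]


def frequent_itemsets(transactions, minsup):
    """Return {itemset: support count} for itemsets found in >= minsup transactions.

    Brute force over all candidate itemsets, so only suitable for tiny inputs.
    """
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            # Support count = number of transactions containing every item.
            count = sum(1 for t in transactions if set(candidate) <= t)
            if count >= minsup:
                result[frozenset(candidate)] = count
    return result


strict = frequent_itemsets(transactions, minsup=3)
loose = frequent_itemsets(transactions, minsup=1)
print(len(strict), len(loose))
```

On this toy input, lowering the threshold from 3 to 1 roughly doubles the number of reported patterns, which is the kind of growth that motivates an additional pruning criterion beyond minimum support.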

Expert Systems With Applications | 2018

Bees swarm optimization guided by data mining techniques for document information retrieval

Youcef Djenouri; Asma Belhadi; Riadh Belkebir

This paper explores advances in the data mining field to solve the fundamental Document Information Retrieval problem. In the proposed approach, useful knowledge is first discovered by using data mining techniques, then swarms use this knowledge to explore the whole space of documents intelligently. We have investigated two data mining techniques in the preprocessing step. The first one splits the collection of documents into clusters of similar documents by using the K-means algorithm, while the second one extracts the closed frequent terms from each previously created cluster by using the DCI_Closed algorithm. For the solving step, BSO (Bees Swarm Optimization) is used to explore the cluster of documents in depth. The proposed approach has been evaluated on well-known collections such as CACM (Collection of ACM), TREC (Text REtrieval Conference), Webdocs, and Wikilinks, and it has been compared to state-of-the-art data mining, bio-inspired, and other document information retrieval approaches. The results show that the proposed approach improves the quality of returned documents considerably, with a competitive computational time compared to state-of-the-art approaches.

Information Sciences | 2018

Fast and effective cluster-based information retrieval using frequent closed itemsets

Youcef Djenouri; Asma Belhadi; Philippe Fournier-Viger; Jerry Chun-Wei Lin

Document information retrieval consists of finding the documents in a collection that are the most relevant to a user query. Information retrieval techniques are widely used by organizations to facilitate the search for information. However, applying traditional information retrieval techniques is time-consuming for large document collections. Recently, cluster-based information retrieval approaches have been developed. Although these approaches are often much faster than traditional approaches for processing large document collections, the quality of the documents they retrieve is often lower than that of traditional approaches. To address this drawback of cluster-based approaches, and improve the performance of information retrieval both in terms of runtime and quality of retrieved documents, this paper proposes a new cluster-based information retrieval approach named ICIR (Intelligent Cluster-based Information Retrieval). The proposed approach combines k-means clustering with frequent closed itemset mining to extract clusters of documents and find frequent terms in each cluster. Patterns discovered in each cluster are then used to select the most relevant document clusters to answer each user query. Four alternative heuristics are proposed to select the most relevant clusters, and two alternative heuristics for choosing documents in the selected clusters. Thus, eight versions of the proposed approach are obtained. To validate the proposed approach, extensive experiments have been carried out on well-known document collections. Results show that the designed approach outperforms traditional and cluster-based information retrieval approaches both in terms of execution time and quality of the returned documents.
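The two-stage idea described above (first pick a cluster using its frequent terms, then pick documents inside it) can be sketched as follows. This is an illustrative assumption, not one of the heuristics of ICIR: the clusters and their frequent terms are hard-coded hypothetical data standing in for the output of k-means and closed itemset mining, and plain term overlap is used as the score in both stages.

```python
# Hypothetical pre-computed clusters: each has documents (as term sets) and
# the frequent terms that a pattern mining step would have found for it.
clusters = {
    "c0": {
        "frequent_terms": {"gpu", "parallel", "cuda"},
        "docs": {
            "d1": {"gpu", "cuda", "kernel"},
            "d2": {"parallel", "gpu", "cluster"},
        },
    },
    "c1": {
        "frequent_terms": {"retrieval", "query", "ranking"},
        "docs": {
            "d3": {"query", "ranking", "index"},
            "d4": {"retrieval", "query", "document"},
        },
    },
}


def select_cluster(query, clusters):
    """Stage 1: choose the cluster whose frequent terms overlap most with the query."""
    return max(clusters, key=lambda c: len(query & clusters[c]["frequent_terms"]))


def rank_documents(query, docs):
    """Stage 2: order documents by term overlap with the query (ties broken by id)."""
    return sorted(docs, key=lambda d: (-len(query & docs[d]), d))


query = {"document", "retrieval", "query"}
best = select_cluster(query, clusters)
ranking = rank_documents(query, clusters[best]["docs"])
print(best, ranking)
```

Only the documents of the selected cluster are scored, which is where the runtime saving of a cluster-based approach comes from; the quality risk is that a relevant document in an unselected cluster is never examined.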

Distributed and Parallel Databases | 2018

How to exploit high performance computing in population-based metaheuristics for solving association rule mining problem

Youcef Djenouri; Djamel Djenouri; Zineb Habbas; Asma Belhadi

This paper explores the application of population-based metaheuristic approaches to the association rule mining (ARM) problem. The combination of GPU and cluster-based parallel computing techniques is investigated for the purpose of accelerating the extraction of correlations between items in sizeable data instances. We propose four parallel approaches that benefit from cluster-intensive computing in the generation process and from massive GPU threading, by evaluating the association rules in parallel on the GPU. To validate the proposed approaches, the most used population-based metaheuristics (GA, PSO, and BSO) have been executed on a cluster of GPUs to solve benchmarks of large and big ARM instances. We used an Intel Xeon 64-bit quad-core E5520 processor coupled to an Nvidia Tesla C2075 GPU device. The results show that BSO outperforms GA and PSO. They also show that the proposed solution outperforms HPC-based ARM approaches when exploring the Webdocs instance (the largest instance existing on the web). To our knowledge, this is the first work that explores the combination of GPU and cluster-based parallel computing with population-based metaheuristics in association rule mining.

Information Sciences | 2018

Mining diversified association rules in big datasets: A cluster/GPU/genetic approach

Youcef Djenouri; Asma Belhadi; Philippe Fournier-Viger; Hamido Fujita

Association rule mining is a popular data mining task, which is important in many domains. Because the task of association rule mining is very time-consuming, evolutionary and swarm based algorithms have been designed to find approximate solutions. However, these approaches still have long execution times, especially when applied to dense and big databases, or when low minsup and minconf threshold values are used. Moreover, these approaches suffer from a lack of diversity in the rules presented to the user. To address these drawbacks of previous algorithms, this paper proposes an efficient parallel algorithm named CGPUGA. It is a genetic algorithm that runs on clusters of GPUs to efficiently discover diversified association rules. It benefits from cluster computing to generate rules. Then, to evaluate rules, which is the most time-consuming task, the designed algorithm relies on massively parallel GPU threads. Furthermore, to deal with the issue of rule quality, the search space of rules is partitioned into several regions assigned to different workers, and the rules found by each worker are then merged to ensure diversification. The designed approach has been empirically compared with state-of-the-art algorithms using small, medium, large and big datasets. Results reveal that CGPUGA is 600 times faster than the sequential version of the algorithm for big datasets. Moreover, it outperforms state-of-the-art high performance computing based association rule mining algorithms for real big datasets such as Pokec, Webdocs and Wikilinks. In terms of rule quality, results show that the designed CGPUGA algorithm provides rules of higher quality compared to the state-of-the-art NIGGAR, MSP-MPSO and MPGA algorithms for diversified association rule mining.
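For readers unfamiliar with the minsup and minconf thresholds mentioned above, the following sketch computes the standard support and confidence of a single rule on a toy database. It only illustrates what evaluating one rule involves, the step the abstract describes as the most time-consuming; it is sequential and is not the CGPUGA algorithm. The basket data and function names are hypothetical.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent together) / support(antecedent); 0.0 if undefined."""
    sup_antecedent = support(antecedent, transactions)
    if sup_antecedent == 0:
        return 0.0
    return support(antecedent | consequent, transactions) / sup_antecedent


# Hypothetical toy database of four baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

# Evaluate the rule {bread} -> {milk}.
rule_sup = support({"bread", "milk"}, baskets)
rule_conf = confidence({"bread"}, {"milk"}, baskets)
print(rule_sup, rule_conf)
```

A rule is kept when its support is at least minsup and its confidence is at least minconf. Each evaluation scans the whole database, so scoring many candidate rules independently is a natural fit for parallel threads.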

International Conference on Big Data | 2018

GBSO-RSS: GPU-Based BSO for Rules Space Summarization

Youcef Djenouri; Jerry Chun-Wei Lin; Djamel Djenouri; Asma Belhadi; Philippe Fournier-Viger

In this paper, we present a novel algorithm, GBSO-RSS, for exploring and mining association rules in big data, where increasing computation time is a major challenge. The GBSO-RSS algorithm is based on meta-rule discovery, which gives the user a summary of the rule space through a meta-rule representation. This allows the user to decide which rules to keep and which to prune. We also adapt a pruning strategy from our previous work to keep only the representative rules. As the meta-rule space is much larger than the rule space, the GPU-based GBSO-RSS approach is proposed for efficient exploitation. The proposed approach has been evaluated on big database instances, and the results illustrate the acceleration of the summarization process. Further experimentation reveals the superiority of GBSO-RSS over the Berrado approach in terms of the number of satisfied association rules.

Applied Intelligence | 2018

A new framework for metaheuristic-based frequent itemset mining

Youcef Djenouri; Djamel Djenouri; Asma Belhadi; Philippe Fournier-Viger; Jerry Chun-Wei Lin

This paper proposes a novel framework for metaheuristic-based Frequent Itemset Mining (FIM), which considers intrinsic features of the FIM problem. The framework, called META-GD, can be used to steer any metaheuristic-based FIM approach. Without loss of generality, three metaheuristics are considered in this paper, namely the genetic algorithm (GA), particle swarm optimization (PSO), and bee swarm optimization (BSO). This yields three approaches, named GA-GD, PSO-GD, and BSO-GD, respectively. An extensive experimental evaluation on medium and large database instances reveals that PSO-GD outperforms state-of-the-art metaheuristic-based approaches in terms of runtime and solution quality.

Information Sciences | 2018