Mohand-Said Mezmaz
University of Mons
Publication
Featured research published by Mohand-Said Mezmaz.
Journal of Parallel and Distributed Computing | 2011
Mohand-Said Mezmaz; Nouredine Melab; Yacine Kessaci; Young Choon Lee; El-Ghazali Talbi; Albert Y. Zomaya; Daniel Tuyttens
In this paper, we investigate the problem of scheduling precedence-constrained parallel applications on heterogeneous computing systems (HCSs) like cloud computing infrastructures. This kind of application has been studied and used in many research works. Most of these works propose algorithms to minimize the completion time (makespan) without paying much attention to energy consumption. We propose a new parallel bi-objective hybrid genetic algorithm that takes into account not only makespan but also energy consumption. We particularly focus on the island parallel model and the multi-start parallel model. Our new method is based on dynamic voltage scaling (DVS) to minimize energy consumption. In terms of energy consumption, the obtained results show that our approach outperforms previous scheduling methods by a significant margin. In terms of completion time, the obtained schedules are also shorter than those of other algorithms. Furthermore, our study demonstrates the potential of DVS.
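To illustrate the bi-objective setting described in this abstract, here is a minimal sketch of Pareto filtering over (makespan, energy) pairs. It is not the authors' genetic algorithm. It only shows how two schedules can be compared when both objectives are minimized, which matters under DVS because lowering the voltage typically saves energy at the cost of a longer execution. The function names and the sample values are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (makespan, energy), both to be minimized


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (makespan, energy) values for four candidate schedules.
candidates = [(10.0, 50.0), (12.0, 40.0), (11.0, 55.0), (15.0, 40.0)]
front = pareto_front(candidates)
print(front)
```

In this example, the schedules (11.0, 55.0) and (15.0, 40.0) are dropped because another candidate is at least as good on both objectives. The two remaining schedules represent a genuine trade-off between completion time and energy.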
International Parallel and Distributed Processing Symposium | 2007
Mohand-Said Mezmaz; Nouredine Melab; El-Ghazali Talbi
Solving large instances of combinatorial optimization problems to optimality requires a huge amount of computational resources. In this paper, we propose an adaptation of the parallel branch and bound algorithm for computational grids. Such gridification is based on new ways to efficiently deal with some crucial issues, mainly dynamic adaptive load balancing, fault tolerance, global information sharing and termination detection of the algorithm. A new efficient coding of the work units (search sub-trees) distributed during the exploration of the search tree is proposed to optimize the involved communications. The algorithm has been implemented following a large-scale idle time stealing paradigm (Farmer-Worker). It has been tested on a flow-shop problem instance (Ta056) that had never been optimally solved. With the new algorithm, the optimal solution was found, with proof of optimality, within 25 days using about 1900 processors belonging to 9 nationwide distinct clusters (administration domains). During the resolution, the worker processors were exploited 97% of the time on average, while the farmer processor was exploited only 1.7% of the time. These two rates are good indicators of the efficiency of the proposed approach and its scalability.
Concurrency and Computation: Practice and Experience | 2013
Imen Chakroun; Mohand-Said Mezmaz; Nouredine Melab; Ahcène Bendjoudi
In this paper, we address the design and implementation of graphical processing unit (GPU)‐accelerated branch‐and‐bound algorithms (B&B) for solving flow‐shop scheduling optimization problems (FSP). Such applications are CPU‐time consuming and highly irregular. On the other hand, GPUs are massively multithreaded accelerators using the single instruction multiple data model at execution. A major issue that arises when executing a B&B applied to FSP on a GPU is thread or branch divergence. Such divergence is caused by the lower bound function of FSP, which contains many irregular loops and conditional instructions. Our challenge is therefore to revisit the design and implementation of B&B applied to FSP so as to deal with thread divergence. Extensive experiments of the proposed approach have been carried out on well‐known FSP benchmarks using an Nvidia Tesla C2050 GPU card (http://www.nvidia.com/docs/IO/43395/NV_DS_Tesla_C2050_C2070_jul10_lores.pdf). Compared with a CPU‐based execution, accelerations up to ×77.46 are achieved for large problem instances.
Parallel Computing | 2006
Nouredine Melab; Mohand-Said Mezmaz; El-Ghazali Talbi
In this paper, we contribute the first results on parallel cooperative multi-objective meta-heuristics on computational grids. We particularly focus on the island model and the multi-start model and their cooperation. We propose a checkpointing-based approach to deal with the fault tolerance issue of the island model. Existing Dispatcher-Worker grid middleware is inadequate for the deployment of parallel cooperative applications. Indeed, it needs to be extended with a software layer to support the cooperation. Therefore, we propose a Linda-like cooperation model and its implementation on top of XtremWeb. This middleware is then used to develop a parallel meta-heuristic applied to a bi-objective Flow-Shop problem using the two models. The work has been tested on a multi-domain education network of 321 heterogeneous Linux PCs. The preliminary results, obtained after more than 10 days, demonstrate that the use of grid computing makes it possible to fully and effectively exploit different parallel models and their combination for solving large-size problem instances. An improvement of the effectiveness by over 60% is realized compared to a serial meta-heuristic.
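The Linda-like cooperation model mentioned in this abstract can be pictured with a small in-process tuple space. The sketch below is an assumption-laden illustration of the general Linda idea (write, read and take tuples by pattern) and does not reproduce the authors' implementation or any XtremWeb API. The class name `TupleSpace`, the `None` wildcard convention and the `"migrant"` tuples are all hypothetical.

```python
import threading
from typing import List, Optional, Tuple


class TupleSpace:
    """Minimal in-process tuple space; None in a pattern acts as a wildcard."""

    def __init__(self) -> None:
        self._tuples: List[Tuple] = []
        self._cond = threading.Condition()

    @staticmethod
    def _matches(pattern: Tuple, tup: Tuple) -> bool:
        return len(pattern) == len(tup) and all(
            p is None or p == t for p, t in zip(pattern, tup)
        )

    def _find(self, pattern: Tuple) -> Optional[Tuple]:
        for tup in self._tuples:
            if self._matches(pattern, tup):
                return tup
        return None

    def out(self, tup: Tuple) -> None:
        """Write a tuple into the space and wake up any waiting readers."""
        with self._cond:
            self._tuples.append(tup)
            self._cond.notify_all()

    def rd(self, pattern: Tuple, timeout: Optional[float] = None) -> Optional[Tuple]:
        """Return a matching tuple without removing it (None on timeout)."""
        with self._cond:
            self._cond.wait_for(lambda: self._find(pattern) is not None, timeout)
            return self._find(pattern)

    def in_(self, pattern: Tuple, timeout: Optional[float] = None) -> Optional[Tuple]:
        """Remove and return a matching tuple (None on timeout)."""
        with self._cond:
            self._cond.wait_for(lambda: self._find(pattern) is not None, timeout)
            tup = self._find(pattern)
            if tup is not None:
                self._tuples.remove(tup)
            return tup


# Hypothetical use: island 0 publishes a solution that another island picks up.
space = TupleSpace()
space.out(("migrant", 0, (3, 1, 2)))
seen = space.rd(("migrant", None, None), timeout=0)
taken = space.in_(("migrant", 0, None), timeout=0)
empty = space.in_(("migrant", None, None), timeout=0)
```

The appeal of this style for island models is that islands never address each other directly: they only agree on the shape of the tuples they exchange, which keeps the cooperation layer independent of how workers are dispatched.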
International Conference on Cluster Computing | 2012
Nouredine Melab; Imen Chakroun; Mohand-Said Mezmaz; Daniel Tuyttens
Branch-and-Bound (B&B) algorithms are time-intensive tree-based exploration methods for solving combinatorial optimization problems to optimality. In this paper, we investigate the use of GPU computing as a major complementary way to speed up those methods. The focus is put on the bounding mechanism of B&B algorithms, which is the most time-consuming part of their exploration process. We propose a parallel B&B algorithm based on a GPU-accelerated bounding model. The proposed approach concentrates on optimizing data access management to further improve the performance of the bounding mechanism, which uses large and intermediate data sets that do not completely fit in GPU memory. Extensive experiments of the contribution have been carried out on well-known FSP benchmarks using an Nvidia Tesla C2050 GPU card. We compared the obtained performances to single-threaded and multithreaded CPU-based executions. Accelerations up to ×100 are achieved for large problem instances.
Parallel, Distributed and Network-Based Processing | 2007
Mohand-Said Mezmaz; Nouredine Melab; El-Ghazali Talbi
The branch and bound (B&B) algorithm is one of the most used methods for solving combinatorial optimization problems exactly. This article focuses on the multi-objective version of this algorithm, and proposes a new parallel approach adapted to grid computing systems. This approach addresses several issues related to the characteristics of the algorithm itself and the properties of grid computing systems. Validation is performed by testing the approach on a bi-objective flow-shop problem instance that had never been solved exactly. This instance was solved after several days of computation on a grid of more than 1000 processors belonging to 7 distinct clusters, and the obtained results demonstrate the efficiency of the proposed approach.
Journal of Parallel and Distributed Computing | 2013
Imen Chakroun; Nordine Melab; Mohand-Said Mezmaz; Daniel Tuyttens
In this paper, we revisit the design and implementation of Branch-and-Bound (B&B) algorithms for solving large combinatorial optimization problems on GPU-enhanced multi-core machines. B&B is a tree-based optimization method that uses four operators (selection, branching, bounding and pruning) to build and explore a highly irregular tree representing the solution space. In our previous work, we proposed a GPU-accelerated approach in which only a single CPU core is used and only the bounding operator is performed on the GPU device. Here, we extend the approach (LL-GB&B) in order to minimize the CPU-GPU communication latency and thread divergence. Such an objective is achieved through a GPU-based fine-grained parallelization of the branching and pruning operators in addition to the bounding one. The second contribution consists in investigating the combination of a GPU with multi-core processing. Two scenarios have been explored, leading to two approaches: a concurrent one (RLL-GB&B) and a cooperative one (PLL-GB&B). In the first one, the exploration process is performed concurrently by the GPU and the CPU cores. In the cooperative approach, the CPU cores prepare and off-load to the GPU pools of tree nodes using data streaming while the GPU performs the exploration. The different approaches have been extensively evaluated on the Flowshop scheduling problem. Compared to a single CPU-based execution, LL-GB&B allows accelerations up to ×160 for large problem instances. Moreover, when combining multi-core and GPU, we find that using RLL-GB&B is not beneficial while PLL-GB&B enables an improvement up to 36% compared to LL-GB&B.
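The four operators named in this abstract (selection, branching, bounding and pruning) can be sketched in a few lines of sequential Python for a tiny permutation flow-shop instance. This is only an illustrative, CPU-only sketch: the lower bound used here is a simple one-machine bound chosen for brevity, not the bound used in the paper, and the function names and processing times are hypothetical.

```python
import heapq
from typing import List, Optional, Sequence, Tuple

Times = Sequence[Sequence[int]]  # p[job][machine] = processing time


def completion_times(prefix: Sequence[int], p: Times) -> List[int]:
    """Time at which each machine finishes the jobs of a partial permutation."""
    m = len(p[0])
    c = [0] * m
    for job in prefix:
        c[0] += p[job][0]
        for k in range(1, m):
            c[k] = max(c[k], c[k - 1]) + p[job][k]
    return c


def lower_bound(prefix: Tuple[int, ...], p: Times) -> int:
    """Simple bound: per machine, remaining work plus the shortest remaining tail."""
    n, m = len(p), len(p[0])
    c = completion_times(prefix, p)
    remaining = [j for j in range(n) if j not in prefix]
    if not remaining:
        return c[-1]  # complete permutation: the bound is the exact makespan
    bound = 0
    for k in range(m):
        work = sum(p[j][k] for j in remaining)
        tail = min(sum(p[j][k + 1:]) for j in remaining)
        bound = max(bound, c[k] + work + tail)
    return bound


def branch_and_bound(p: Times) -> Tuple[Optional[Tuple[int, ...]], float]:
    n = len(p)
    best_perm: Optional[Tuple[int, ...]] = None
    best_cost: float = float("inf")
    pool = [(lower_bound((), p), ())]  # pending nodes, ordered by bound
    while pool:
        bound, prefix = heapq.heappop(pool)          # selection (best-first)
        if bound >= best_cost:                       # pruning
            continue
        if len(prefix) == n:                         # leaf: new best solution
            best_perm, best_cost = prefix, bound
            continue
        for job in range(n):                         # branching
            if job not in prefix:
                child = prefix + (job,)
                child_bound = lower_bound(child, p)  # bounding
                if child_bound < best_cost:
                    heapq.heappush(pool, (child_bound, child))
    return best_perm, best_cost


# Hypothetical instance: 3 jobs on 2 machines.
times = [[3, 2], [1, 4], [2, 1]]
perm, cost = branch_and_bound(times)
print(perm, cost)
```

In a sketch like this, the bounding call inside the branching loop is where most of the time goes for realistic instances, which is consistent with the abstract's choice to move that operator to the GPU first and the other operators afterwards.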
Congress on Evolutionary Computation | 2010
Mohand-Said Mezmaz; Young Choon Lee; Nouredine Melab; El-Ghazali Talbi; Albert Y. Zomaya
Precedence-constrained parallel applications are one of the most typical application models used in scientific and engineering fields. Almost all efforts on this kind of application have focused on the minimization of makespan (completion time). It is only recently that much attention has been paid to energy consumption. In this paper, we address precedence-constrained parallel applications on heterogeneous computing systems (HCSs). We propose a new bi-objective hybrid genetic algorithm that takes into account not only makespan but also energy consumption. This metaheuristic adopts dynamic voltage scaling (DVS) to minimize energy consumption. Our study provides promising results showing the significance and potential of DVS. The experimental results from our comparative evaluation study confirm the superior performance of our approach over the other known heuristics on both criteria, energy saving and completion time.
International Conference on e-Science | 2006
Mohand-Said Mezmaz; Nouredine Melab; El-Ghazali Talbi
The focus of this paper is on the parallel multi-start and island models of meta-heuristics within the context of multi-objective optimization on the computational grid. The combination of these two models often provides very effective parallel algorithms. However, experiments on large-size problem instances are often stopped before the convergence of these algorithms is achieved. The full exploitation of the cooperation needs a large amount of computational resources and the management of the fault tolerance issue. In this paper, we propose a grid-based fault-tolerant approach for these models and their implementation on the XtremWeb grid middleware. The approach has been tested on the bi-objective Flow-Shop problem on a computational grid, which is a multi-domain education network composed of 321 heterogeneous Linux PCs. The preliminary results, obtained after an execution time of several days, demonstrate that the use of grid computing makes it possible to exploit the two parallel models and their combination fully, effectively and efficiently for solving challenging optimization problems. An improvement of the effectiveness by over 60% compared to a serial meta-heuristic is obtained with a computational grid.
International Parallel and Distributed Processing Symposium | 2005
Nouredine Melab; Mohand-Said Mezmaz; El-Ghazali Talbi
Solving large and time-intensive combinatorial optimization problems with parallel hybrid multi-objective evolutionary algorithms (MO-EAs) requires a large amount of computational resources. Peer-to-peer (P2P) computing has recently emerged as a powerful way to harness these resources and efficiently deal with such problems. In this paper, we focus on the parallel hybrid multi-objective island model for P2P systems. We address its design, implementation, and fault-tolerant deployment in a P2P context. The proposed model has been tested on the bi-criterion permutation flow-shop problem (BPFSP) on a network of 120 heterogeneous PCs. The preliminary results demonstrate the effectiveness of this model and its capability to fully exploit the hybridization.