Moustafa Ghanem | Researchain

Publication

Featured research published by Moustafa Ghanem.

cluster computing and the grid | 2012

Lightweight Resource Scaling for Cloud Applications

Rui Han; Li Guo; Moustafa Ghanem; Yike Guo

Elastic resource provisioning is a key feature of cloud computing, allowing users to scale up or down resource allocation for their applications at run-time. To date, most practical approaches to managing elasticity are based on allocation/de-allocation of the virtual machine (VM) instances to the application. This VM-level elasticity typically incurs both considerable overhead and extra costs, especially for applications with rapidly fluctuating demands. In this paper, we propose a lightweight approach to enable cost-effective elasticity for cloud applications. Our approach operates fine-grained scaling at the resource level itself (CPUs, memory, I/O, etc.) in addition to VM-level scaling. We also present the design and implementation of an intelligent platform for lightweight resource management of cloud applications. We describe our algorithms for lightweight scaling and VM-level scaling and show their interaction. We then use an industry standard benchmark to evaluate the effectiveness of our approach and compare its performance against traditional approaches.
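The interplay between resource-level and VM-level scaling described in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the `VM` class, the `max_cpus` ceiling, the `scale_up` function and the default sizes for new VMs are all hypothetical names and values chosen for illustration, and only CPUs are modelled. The sketch simply prefers resizing existing VMs in place and falls back to adding VMs for whatever demand remains.

```python
from dataclasses import dataclass


@dataclass
class VM:
    cpus: int      # CPUs currently allocated to this VM
    max_cpus: int  # hypothetical ceiling the host allows for this VM


def scale_up(vms, extra_cpus, new_vm_cpus=2, new_vm_max=4):
    """Meet a demand for extra CPUs (illustrative policy, not the paper's).

    First grow existing VMs in place (resource-level scaling); only the
    demand that cannot be met this way triggers new VMs (VM-level scaling).
    Returns the list of actions taken and mutates `vms`.
    """
    actions = []
    remaining = extra_cpus

    # Resource-level step: use the headroom of VMs that already exist.
    for index, vm in enumerate(vms):
        if remaining == 0:
            break
        grant = min(vm.max_cpus - vm.cpus, remaining)
        if grant > 0:
            vm.cpus += grant
            remaining -= grant
            actions.append(("resize", index, grant))

    # VM-level step: start new VMs only for the unmet remainder.
    while remaining > 0:
        size = min(new_vm_max, max(new_vm_cpus, remaining))
        vms.append(VM(cpus=size, max_cpus=new_vm_max))
        actions.append(("add_vm", len(vms) - 1, size))
        remaining -= min(size, remaining)

    return actions
```

For example, calling `scale_up([VM(2, 4), VM(3, 4)], 5)` resizes both existing VMs before starting a single new one. A real controller would also have to handle memory and I/O, scale-down, and pricing, none of which this sketch attempts.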

Sensors | 2008

Air Pollution Monitoring and Mining Based on Sensor Grid in London

Yajie Ma; Mark Richards; Moustafa Ghanem; Yike Guo; John Hassard

In this paper, we present a distributed infrastructure based on wireless sensor networks and Grid computing technology for air pollution monitoring and mining. It aims to develop low-cost and ubiquitous sensor networks that collect real-time, large-scale and comprehensive environmental data from road traffic emissions for air pollution monitoring in urban environments. The main informatics challenges with respect to constructing the high-throughput sensor Grid are discussed in this paper. We present a two-layer network framework, a P2P e-Science Grid architecture, and a distributed data mining algorithm as solutions to address these challenges. We simulated the system in TinyOS to examine the operation of each sensor as well as the networking performance. We also present the distributed data mining results to examine the effectiveness of the algorithm.

conference on information and knowledge management | 2005

A novel refinement approach for text categorization

Songbo Tan; Xueqi Cheng; Moustafa Ghanem; Bin Wang; Hongbo Xu

In this paper we present a novel strategy, DragPushing, for improving the performance of text classifiers. The strategy is generic and takes advantage of training errors to successively refine the classification model of a base classifier. We describe how it is applied to generate two new classification algorithms: a Refined Centroid Classifier and a Refined Naïve Bayes Classifier. We present an extensive experimental evaluation of both algorithms on three English collections and one Chinese corpus. The results indicate that in each case, the refined classifiers achieve significant performance improvement over the base classifiers used. Furthermore, the performance of the Refined Centroid Classifier is comparable to, if not better than, that of a state-of-the-art support vector machine (SVM)-based classifier, while offering a much lower computational cost.
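The idea of refining a base classifier from its own training errors can be sketched for a centroid classifier as follows. This is an illustrative reading of the abstract, not the authors' implementation: the function names, the dot-product similarity, the fixed `step` size and the batch update per round are all assumptions made here. For each misclassified training example, the centroid of its true class is moved toward it and the centroid of the wrongly predicted class is moved away from it.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def train_centroids(X, y, n_classes):
    """Base classifier: one mean vector per class (each class must be non-empty)."""
    dim = len(X[0])
    centroids = []
    for c in range(n_classes):
        members = [x for x, label in zip(X, y) if label == c]
        centroids.append(
            [sum(x[j] for x in members) / len(members) for j in range(dim)]
        )
    return centroids


def predict(centroids, x):
    """Return the index of the centroid with the highest dot-product score."""
    scores = [dot(c, x) for c in centroids]
    return scores.index(max(scores))


def refine(centroids, X, y, rounds=10, step=0.5):
    """Error-driven refinement of the centroids (assumed update rule).

    Each round collects the current training errors, then for every error
    adds step * x to the true-class centroid and subtracts step * x from
    the wrongly predicted one. Stops early when no errors remain.
    """
    C = [list(c) for c in centroids]
    for _ in range(rounds):
        errors = []
        for x, true_label in zip(X, y):
            predicted = predict(C, x)
            if predicted != true_label:
                errors.append((x, true_label, predicted))
        if not errors:
            break
        for x, true_label, predicted in errors:
            for j, xj in enumerate(x):
                C[true_label][j] += step * xj
                C[predicted][j] -= step * xj
    return C
```

On the toy data `X = [[2, 0], [2, 0], [1, 1.2], [0, 2], [0, 2]]` with labels `[0, 0, 0, 1, 1]`, the base centroids misclassify the third example, and one refinement round is enough to correct it. Real text collections would use sparse, normalised term vectors and a tuned step size.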

BMC Bioinformatics | 2012

Tavaxy: Integrating Taverna and Galaxy workflows with cloud computing support

Mohamed Abouelhoda; Shadi Alaa Issa; Moustafa Ghanem

Background: Over the past decade the workflow system paradigm has evolved as an efficient and user-friendly approach for developing complex bioinformatics applications. Two popular workflow systems that have gained acceptance by the bioinformatics community are Taverna and Galaxy. Each system has a large user-base and supports an ever-growing repository of application workflows. However, workflows developed for one system cannot be imported and executed easily on the other. The lack of interoperability is due to differences in the models of computation, workflow languages, and architectures of both systems. This lack of interoperability limits sharing of workflows between the user communities and leads to duplication of development efforts.
Results: In this paper, we present Tavaxy, a stand-alone system for creating and executing workflows based on using an extensible set of re-usable workflow patterns. Tavaxy offers a set of new features that simplify and enhance the development of sequence analysis applications: it allows the integration of existing Taverna and Galaxy workflows in a single environment, and supports the use of cloud computing capabilities. The integration of existing Taverna and Galaxy workflows is supported seamlessly at both run-time and design-time levels, based on the concepts of hierarchical workflows and workflow patterns. The use of cloud computing in Tavaxy is flexible, where the users can either instantiate the whole system on the cloud, or delegate the execution of certain sub-workflows to the cloud infrastructure.
Conclusions: Tavaxy reduces the workflow development cycle by introducing the use of workflow patterns to simplify workflow creation. It enables the re-use and integration of existing (sub-) workflows from Taverna and Galaxy, and allows the creation of hybrid workflows. Its additional features exploit recent advances in high performance cloud computing to cope with the increasing data size and complexity of analysis. The system can be accessed either through a cloud-enabled web-interface or downloaded and installed to run within the user's local environment. All resources related to Tavaxy are available at http://www.tavaxy.org.

cairo international biomedical engineering conference | 2008

Scientific workflow systems - can one size fit all?

Vasa Curcin; Moustafa Ghanem

The past decade has witnessed a growing trend in designing and using workflow systems with a focus on supporting the scientific research process in bioinformatics and other areas of life sciences. The aim of these systems is mainly to simplify access, control and orchestration of remote distributed scientific data sets using remote computational resources, such as EBI web services. In this paper we present the state of the art in the field by reviewing six such systems: Discovery Net, Taverna, Triana, Kepler, Yawl and BPEL. We provide a high-level framework for comparing the systems based on their control flow and data flow properties with a view of both informing future research in the area by academic researchers and facilitating the selection of the most appropriate system for a specific application task by practitioners.

knowledge discovery and data mining | 2002

Discovery net: towards a grid of knowledge discovery

Vasa Curcin; Moustafa Ghanem; Yike Guo; Martin Köhler; Anthony Rowe; Jameel Syed; Patrick Wendel

This paper provides a blueprint for constructing collaborative and distributed knowledge discovery systems within Grid-based computing environments. The need for such systems is driven by the quest for sharing knowledge, information and computing resources within the boundaries of single large distributed organisations or within complex Virtual Organisations (VO) created to tackle specific projects. The proposed architecture is built on top of a resource federation management layer and is composed of a set of different resources. We show how this architecture will behave during a typical KDD process design and deployment, how it enables the execution of complex and distributed data mining tasks with high performance and how it provides a community of e-scientists with means to collaborate on, retrieve and reuse KDD algorithms, discovery processes and knowledge in a visual analytical environment.

ieee international conference on high performance computing data and analytics | 2003

The Design of Discovery Net: Towards Open Grid Services for Knowledge Discovery

Salman AlSairafi; Filippia-Sofia Emmanouil; Moustafa Ghanem; Nikolaos Giannadakis; Yike Guo; Dimitrios Kalaitzopoulos; Michelle Osmond; Anthony Rowe; Jameel Syed; Patrick Wendel

With the emergence of distributed resources and grid technologies there is a need to provide higher level informatics infrastructures allowing scientists to easily create and execute meaningful data integration and analysis processes that take advantage of the distributed nature of the available resources. These resources typically include heterogeneous data sources, computational resources for task execution and various application-specific services. The effort of the high performance community has so far mainly focused on the delivery of low-level informatics infrastructures enabling the basic needs of grid applications. Such infrastructures are essential but do not directly help end-users in creating generic and re-usable applications. In this paper, we present the Discovery Net architecture for building grid-based knowledge discovery applications. Our architecture enables the creation of high-level, re-usable and distributed application workflows that use a variety of common types of distributed resources. It is built on top of standard protocols and standard infrastructures such as Globus but also defines its own protocols such as the Discovery Process Mark-up Language for data flow management. We discuss an implementation of our architecture and evaluate it by building a real-time genome annotation environment on top.

IEEE Sensors Journal | 2011

Distributed Clustering-Based Aggregation Algorithm for Spatial Correlated Sensor Networks

Yajie Ma; Yike Guo; Xiangchuan Tian; Moustafa Ghanem

In wireless sensor networks, it has been noted that nearby sensor nodes monitoring an environmental feature typically register similar values. This kind of data redundancy, due to the spatial correlation between sensor observations, has inspired research into in-network data aggregation. In this paper, an α-local spatial clustering algorithm for sensor networks is proposed. By measuring the spatial correlation between data sampled by different sensors, the algorithm constructs a dominating set as the sensor network backbone used to realize the data aggregation based on the information description/summarization performance of the dominators. In order to evaluate the performance of the algorithm, a pattern recognition scenario over environmental data is presented. The evaluation shows that the resulting network achieved by our algorithm can provide environmental information at higher accuracy compared to other algorithms.

advanced information networking and applications | 2012

Elastic Application Container: A Lightweight Approach for Cloud Resource Provisioning

Sijin He; Li Guo; Yike Guo; Chao Wu; Moustafa Ghanem; Rui Han

Virtual machine (VM) based virtual infrastructure has been adopted widely in cloud computing environments for elastic resource provisioning. Performing resource management using VMs, however, is a heavyweight task. In practice, we have identified two scenarios where VM-based resource management is less feasible and less resource-efficient. In this paper, we propose a lightweight resource management model called Elastic Application Container (EAC). EAC is a virtual resource unit for delivering better resource efficiency and more scalable cloud applications. We describe the EAC system architecture and components, and also present an algorithm for EAC resource provisioning. We also describe an implementation of the EAC-oriented platform to support multi-tenant cloud use. To evaluate our approach and implementation, we conducted experiments and collected performance data by comparing VM-based and EAC-based resource management with regard to their feasibility and resource-efficiency. The experimental results show that our proposed EAC-based resource management approach outperforms the VM-based approach in terms of feasibility and resource-efficiency.

international conference on cloud computing | 2012

Improving Resource Utilisation in the Cloud Environment Using Multivariate Probabilistic Models

Sijin He; Li Guo; Moustafa Ghanem; Yike Guo

Resource provisioning based on virtual machines (VMs) has been widely accepted and adopted in cloud computing environments. A key problem resulting from using static scheduling approaches for allocating VMs on different physical machines (PMs) is that resources tend not to be fully utilised. Although some existing cloud reconfiguration algorithms have been developed to address the problem, they normally result in high migration costs and low resource utilisation due to ignoring the multi-dimensional characteristics of VMs and PMs. In this paper we present and evaluate a new algorithm for improving resource utilisation for cloud providers. By using a multivariate probabilistic model, our algorithm selects suitable PMs for VM re-allocation which are then used to generate a reconfiguration plan. We also describe two heuristic metrics which can be used in the algorithm to capture the multi-dimensional characteristics of VMs and PMs. By combining these two heuristic metrics in our experiments, we observed that our approach improves the resource utilisation level by around 8% for cloud providers, such as IC Cloud, which accept user-defined VM configurations and 14% for providers, such as Amazon EC2, which only provide limited types of VM configurations.
