Omar Batarfi | Researchain

Publication

Featured research published by Omar Batarfi.

Cluster Computing | 2015

Large scale graph processing systems: survey and an experimental evaluation

Omar Batarfi; Radwa El Shawi; Ayman G. Fayoumi; Reza Nouri; Seyed-Mehdi-Reza Beheshti; Ahmed Barnawi; Sherif Sakr

A graph is a fundamental data structure that captures relationships between different data entities. In practice, graphs are widely used for modeling complicated data in different application domains such as social networks, protein networks, transportation networks, bibliographical networks, knowledge bases and many more. Currently, graphs with millions and billions of nodes and edges have become very common. In principle, graph analytics is an important big data discovery technique. Therefore, with the increasing abundance of large graphs, designing scalable systems for processing and analyzing large scale graphs has become one of the most timely problems facing the big data research community. In general, scalable processing of big graphs is a challenging task due to their size and the inherent irregular structure of graph computations. Thus, in recent years, we have witnessed an unprecedented interest in building big graph processing systems that attempt to tackle these challenges. In this article, we provide a comprehensive survey of the state of the art in large scale graph processing platforms. In addition, we present an extensive experimental study of five popular systems in this domain, namely, GraphChi, Apache Giraph, GPS, GraphLab and GraphX. In particular, we report and analyze the performance characteristics of these systems using five common graph processing algorithms and seven large graph datasets. Finally, we identify a set of current open research challenges and discuss some promising directions for future research in the domain of large scale graph processing.

grid computing | 2016

Big Data 2.0 Processing Systems: Taxonomy and Open Challenges

Fuad Bajaber; Radwa Elshawi; Omar Batarfi; Abdulrahman H. Altalhi; Ahmed Barnawi; Sherif Sakr

Data is a key resource in the modern world. Big data has become a popular term used to describe the exponential growth and availability of data. In practice, the growing demand for large-scale data processing and data analysis applications has spurred the development of novel solutions from both industry and academia. For a decade, the MapReduce framework, and its open source realization, Hadoop, has been a highly successful framework that has created a lot of momentum in both the research and industrial communities, such that it has become the de facto standard for big data processing platforms. However, in recent years, academia and industry have started to recognize the limitations of the Hadoop framework in several application domains and big data processing scenarios such as large scale processing of structured data, graph data and streaming data. Thus, we have witnessed an unprecedented interest in tackling these challenges with new solutions, which constitute a new wave of mostly domain-specific, optimized big data processing platforms. In this article, we refer to this new wave of systems as Big Data 2.0 processing systems. To better understand the latest ongoing developments in the world of big data processing systems, we provide a taxonomy and detailed analysis of the state of the art in this domain. In addition, we identify a set of current open research challenges and discuss some promising directions for future research.

Technology Conference on Performance Evaluation and Benchmarking | 2014

On Characterizing the Performance of Distributed Graph Computation Platforms

Ahmed Barnawi; Omar Batarfi; Seyed-Mehdi-Reza Beheshti; Radwa Elshawi; Ayman G. Fayoumi; Reza Nouri; Sherif Sakr

Graphs are widely used for modeling complicated data in different application domains such as social networks, protein networks, transportation networks, bibliographical networks, knowledge bases and many more. Currently, graphs with millions and billions of nodes and edges have become very common. Therefore, designing scalable systems for processing and analyzing large scale graphs has become one of the most timely problems facing the big data research community. In practice, distributed processing of large scale graphs is a challenging task due to their size, their inherent irregular structure and the iterative nature of graph processing and computation algorithms. In recent years, several distributed graph processing systems have been presented, most notably Pregel and GraphLab, to tackle this challenge. In particular, both systems use a vertex-centric computation model which enables the user to design a program that is executed locally for each vertex in parallel. In this paper, we analyze the performance characteristics of distributed graph processing systems and provide an experimental comparison of the performance of two popular systems in this area.
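
The vertex-centric model mentioned in this abstract can be illustrated with a small sketch. It is not the API of Pregel, GraphLab or any other system named above: the function names (run_vertex_centric, min_label_compute) and the message format are assumptions made for illustration, and the "parallel" per-vertex execution is simulated with a sequential loop over supersteps. The example labels each vertex with the smallest vertex id in its connected component, assuming an undirected graph given as symmetric adjacency lists.

```python
from collections import defaultdict


def run_vertex_centric(graph, compute, initial_value, max_supersteps=50):
    """Run a user-supplied per-vertex `compute` function in synchronous supersteps.

    graph: dict mapping each vertex to a list of neighbouring vertices.
    compute: hypothetical user function returning (new_value, messages),
             where messages is a list of (target_vertex, payload) pairs.
    """
    values = {v: initial_value(v) for v in graph}
    inbox = {v: [] for v in graph}
    active = set(graph)  # every vertex is active in superstep 0
    for superstep in range(max_supersteps):
        if not active:
            break  # all vertices halted and no messages are in flight
        outbox = defaultdict(list)
        # A real system would run this loop in parallel across workers.
        for v in active:
            new_value, messages = compute(
                v, values[v], inbox.get(v, []), graph[v], superstep
            )
            values[v] = new_value
            for target, payload in messages:
                outbox[target].append(payload)
        inbox = outbox
        active = set(outbox)  # only vertices that received messages wake up
    return values


def min_label_compute(vertex, value, messages, neighbours, superstep):
    """Propagate the smallest label seen so far to all neighbours."""
    best = min([value] + messages)
    if superstep == 0 or best < value:
        return best, [(n, best) for n in neighbours]
    return value, []  # nothing changed: send no messages (vote to halt)


example_graph = {1: [2], 2: [1, 3], 3: [2], 4: [5], 5: [4]}
labels = run_vertex_centric(example_graph, min_label_compute, lambda v: v)
```

In this sketch the user writes only min_label_compute, which sees a single vertex, its incoming messages and its neighbours; the surrounding loop stands in for the scheduling and message delivery that a distributed platform would provide.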

international conference on big data | 2015

Big Graph Processing Systems: State-of-the-Art and Open Challenges

Radwa Elshawi; Omar Batarfi; Ayman G. Fayoumi; Ahmed Barnawi; Sherif Sakr

A graph is a fundamental data structure that captures relationships between different data entities. In practice, graphs are widely used for modeling complicated data in different application domains such as social networks, protein networks, transportation networks, bibliographical networks, knowledge bases and many more. Currently, graphs with millions and billions of nodes and edges have become very common. In principle, graph analytics is an important big data discovery technique. Therefore, with the increasing abundance of large graphs, designing scalable systems for processing and analyzing large scale graphs has become one of the most timely problems facing the big data research community. In general, distributed processing of big graphs is a challenging task due to their size and the inherent irregular structure of graph computations. Thus, in recent years, we have witnessed an unprecedented interest in building big graph processing systems that attempt to tackle these challenges. To better understand the challenges of developing scalable graph processing systems, in this paper, we provide a comprehensive overview of the state of the art. In addition, we identify a set of current open research challenges and discuss some promising directions for future research.

international conference on cloud computing | 2015

Big Data Processing Systems: State-of-the-Art and Open Challenges

Sherif Sakr; Fuad Bajaber; Ahmed Barnawi; Abdulrahman H. Altalhi; Radwa Elshawi; Omar Batarfi

The growing demand for large-scale data processing and data analysis applications has spurred the development of novel solutions from both industry and academia. In the last decade, the MapReduce framework has emerged as a highly successful framework that has created a lot of momentum in both the research and industrial communities, such that it has become the de facto standard for big data processing platforms. In particular, the MapReduce framework was introduced to provide a simple but powerful programming model and runtime environment that eases the job of developing scalable parallel applications to process vast amounts of data on large clusters of commodity machines. However, recently, academia and industry have started to recognize the limitations of the Hadoop framework in several application domains such as large scale processing of structured data, graph data and streaming data. Thus, in recent years, we have witnessed an unprecedented interest in tackling these challenges, which has given rise to a new wave of domain-specific, optimized big data processing platforms. To better understand the latest ongoing developments in the world of big data processing systems, in this paper, we provide a detailed overview and analysis of the state of the art in this domain. In addition, we identify a set of current open research challenges and discuss some promising directions for future research.
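
The MapReduce programming model referred to in this abstract can be sketched in a few lines. This is a single-process illustration, not Hadoop's API: the names map_reduce, word_count_mapper and word_count_reducer are assumptions made for the example, and the in-memory grouping step stands in for the distributed shuffle that a real runtime performs across a cluster.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Apply `mapper` to every record, group emitted values by key, then reduce."""
    groups = defaultdict(list)
    for record in records:
        # Map phase: each record may emit any number of (key, value) pairs.
        for key, value in mapper(record):
            groups[key].append(value)  # stand-in for the shuffle/sort step
    # Reduce phase: each key is reduced independently of the others.
    return {key: reducer(key, values) for key, values in groups.items()}


def word_count_mapper(line):
    """Emit (word, 1) for every whitespace-separated word in a line."""
    for word in line.lower().split():
        yield word, 1


def word_count_reducer(word, counts):
    """Sum the partial counts collected for one word."""
    return sum(counts)


counts = map_reduce(["big data", "big graph data"],
                    word_count_mapper, word_count_reducer)
```

The user supplies only the two small functions; because each record is mapped and each key is reduced independently, a runtime is free to spread both phases over many machines.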

SpringerPlus | 2016

A distributed query execution engine of big attributed graphs

Omar Batarfi; Radwa Elshawi; Ayman G. Fayoumi; Ahmed Barnawi; Sherif Sakr

A graph is a popular data model that has become pervasively used for modeling structural relationships between objects. In practice, in many real-world graphs, the graph vertices and edges need to be associated with descriptive attributes. Such graphs are referred to as attributed graphs. G-SPARQL has been proposed as an expressive language, with a centralized execution engine, for querying attributed graphs. G-SPARQL supports various types of graph querying operations including reachability, pattern matching and shortest path, where any G-SPARQL query may include value-based predicates on the descriptive information (attributes) of the graph edges/vertices in addition to the structural predicates. In general, a main limitation of centralized systems is that their vertical scalability is always restricted by the physical limits of computer systems. This article describes the design, implementation and performance evaluation of DG-SPARQL, a distributed, hybrid and adaptive parallel execution engine for G-SPARQL queries. In this engine, the topology of the graph is distributed over the main memory of the underlying nodes while the graph data are maintained in a relational store which is replicated on the disk of each of the underlying nodes. DG-SPARQL evaluates parts of the query plan via SQL queries which are pushed to the underlying relational stores while other parts of the query plan, as necessary, are evaluated via indexless memory-based graph traversal algorithms. Our experimental evaluation shows the efficiency and the scalability of DG-SPARQL on querying massive attributed graph datasets in addition to its ability to outperform Apache Giraph, a popular distributed graph processing system, by orders of magnitude.
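
The hybrid evaluation idea described in this abstract, with attribute predicates answered by SQL and structural predicates answered by in-memory traversal, can be illustrated on a single machine. The sketch below is not DG-SPARQL's implementation or query language: the table vertex_attr, its columns, the function names and the sample data are all hypothetical, and SQLite stands in for the relational store. It answers a reachability query with a value-based predicate: which vertices reachable from a named source have an age of at least a given threshold.

```python
import sqlite3
from collections import deque

# Hypothetical relational store holding the vertex attributes.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vertex_attr (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)"
)
conn.executemany(
    "INSERT INTO vertex_attr VALUES (?, ?, ?)",
    [(1, "alice", 34), (2, "bob", 19), (3, "carol", 41), (4, "dave", 52)],
)

# Hypothetical in-memory topology: directed adjacency lists keyed by vertex id.
topology = {1: [2], 2: [3], 3: [], 4: [1]}


def reachable(topology, source):
    """Breadth-first traversal over the in-memory topology (no index used)."""
    seen = {source}
    queue = deque([source])
    while queue:
        vertex = queue.popleft()
        for neighbour in topology.get(vertex, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    seen.discard(source)  # simplification: the source itself is never reported
    return seen


def reachable_with_predicate(conn, topology, source_name, min_age):
    """Combine a structural predicate (reachability) with an attribute predicate."""
    # Attribute lookup of the source vertex is pushed to the relational store.
    row = conn.execute(
        "SELECT id FROM vertex_attr WHERE name = ?", (source_name,)
    ).fetchone()
    if row is None:
        return []
    # The structural part is evaluated by traversal in main memory.
    candidates = sorted(reachable(topology, row[0]))
    if not candidates:
        return []
    # The value-based filter on the candidates is again evaluated in SQL.
    placeholders = ",".join("?" * len(candidates))
    rows = conn.execute(
        f"SELECT name FROM vertex_attr "
        f"WHERE age >= ? AND id IN ({placeholders}) ORDER BY name",
        (min_age, *candidates),
    ).fetchall()
    return [name for (name,) in rows]


result = reachable_with_predicate(conn, topology, "alice", 30)
```

The split mirrors the division of labour stated in the abstract only at a conceptual level; how the actual engine partitions the topology, replicates the store and chooses between the two evaluation strategies is described in the article itself.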

Cluster Computing | 2018

Correction To: Large scale graph processing systems: survey and an experimental evaluation

Omar Batarfi; Radwa El Shawi; Ayman G. Fayoumi; Reza Nouri; Seyed-Mehdi-Reza Beheshti; Ahmed Barnawi; Sherif Sakr

The original version of this article unfortunately contained a mistake in the acknowledgement statement.

Archive | 2016

Automated Behavioral Malware Analysis System

Saja Alqurashi; Omar Batarfi

Nowadays, with the spread of internet and network-based services, malware has become a major threat to computers and information systems. Different malware samples often share similar behaviours yet have different syntactic structures, owing to obfuscation techniques such as polymorphism, oligomorphism and metamorphism. The differing structures of behaviourally similar malware pose a serious problem for signature-based detection techniques. In this paper we propose an automated prevention system based on malware behaviours. Our system is able to collect suspicious software from client computers and then automatically analyse the behaviour of detected malware. The agent then sends an alarm to all network clients. The results from an implementation of the proposed system show that our approach is effective in analysing detected malware in automated security systems.

International Journal of Advanced Computer Science and Applications | 2016

A Cloud-Based Platform for Democratizing and Socializing the Benchmarking Process

Fuad Bajaber; Amin Shafaat; Omar Batarfi; Radwa Elshawi; Abdulrahman H. Altalhi; Ahmed Barnawi; Sherif Sakr

Performance evaluation, benchmarking and reproducibility represent significant aspects for evaluating the practical impact of scientific research outcomes in the Computer Science field. In spite of all the benefits (e.g., increasing visibility, boosting impact, improving research quality) which can be obtained from conducting comprehensive and extensive experimental evaluations or providing reproducible software artifacts and a detailed description of the experimental setup, the effort required to achieve these goals remains prohibitive. In this article, we present the design and the implementation details of the Liquid Benchmarking platform as a social and cloud-based platform for democratizing and socializing the software benchmarking process. In particular, the platform facilitates the sharing of experimental artifacts (computing resources, datasets, software implementations, benchmarking tasks) as services, where end users can easily design, mash up and execute experiments and visualize the experimental results with zero installation or configuration effort. Moreover, the social features of the platform enable users to share and provide feedback on the results of the executed experiments in a form that can guarantee a transparent scientific crediting process. Finally, we present four benchmarking case studies that have been realized via the Liquid Benchmarking platform in the following domains: XML compression techniques, graph indexing and querying techniques, string similarity join algorithms and reverse K nearest neighbors algorithms.

Journal of Information Security | 2016