Sherif Sakr
King Saud bin Abdulaziz University for Health Sciences
Publication
Featured research published by Sherif Sakr.
IEEE Communications Surveys and Tutorials | 2011
Sherif Sakr; Anna Liu; Daniel M. Batista; Mohammad Alomari
In the last two decades, the continuous increase of computational power has produced an overwhelming flow of data. Moreover, the recent advances in Web technology have made it easy for any user to provide and consume content of any form. This has called for a paradigm shift in the computing architecture and large scale data processing mechanisms. Cloud computing is associated with a new paradigm for the provision of computing infrastructure. This paradigm shifts the location of this infrastructure to the network to reduce the costs associated with the management of hardware and software resources. This paper gives a comprehensive survey of numerous approaches and mechanisms of deploying data-intensive applications in the cloud which are gaining a lot of momentum in both research and industrial communities. We analyze the various design decisions of each approach and its suitability to support certain classes of applications and end-users. A discussion of some open issues and future challenges pertaining to scalability, consistency, and economical processing of large scale data on the cloud is provided. We highlight the characteristics of the best candidate classes of applications that can be deployed in the cloud.
very large data bases | 2004
Torsten Grust; Sherif Sakr; Jens Teubner
Relational database systems may be turned into efficient XML and XPath processors if the system is provided with a suitable relational tree encoding. This paper extends this relational XML processing stack and shows that an RDBMS can also serve as a highly efficient XQuery runtime environment. Our approach is purely relational: XQuery expressions are compiled into SQL code which operates on the tree encoding. The core of the compilation procedure trades XQuery's notions of variable scopes and nested iteration (FLWOR blocks) for equi-joins. The resulting relational XQuery processor closely adheres to the language semantics, e.g., it obeys node identity as well as document and sequence order, and can support XQuery's full axis feature. The system exhibits quite promising performance figures in experiments. Somewhat unexpectedly, we will also see that the XQuery compiler can make good use of SQL's OLAP functionality.
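As a rough illustration of what a relational tree encoding can look like, the sketch below stores each XML element with its pre-order and post-order rank and evaluates the XPath descendant axis as a SQL join. The abstract does not name the encoding, so the pre/post scheme, the `node` table, and the helper names `encode` and `descendants` are assumptions made for this example; it covers a single XPath axis only and is not the paper's XQuery compiler.

```python
import sqlite3
import xml.etree.ElementTree as ET


def encode(root):
    """Return (pre, post, tag) rows for every element, in document order."""
    rows = []
    counters = {"pre": 0, "post": 0}

    def visit(node):
        pre = counters["pre"]
        counters["pre"] += 1
        index = len(rows)
        rows.append(None)  # reserve the slot so rows stay in document order
        for child in node:
            visit(child)
        # The post-order rank is assigned only after all children are done.
        rows[index] = (pre, counters["post"], node.tag)
        counters["post"] += 1

    visit(root)
    return rows


def descendants(conn, tag):
    """Tags of all descendants of elements named `tag`, in document order."""
    sql = """
        SELECT d.tag
        FROM node AS c JOIN node AS d
          ON d.pre > c.pre AND d.post < c.post
        WHERE c.tag = ?
        ORDER BY d.pre
    """
    return [row[0] for row in conn.execute(sql, (tag,))]


doc = ET.fromstring("<a><b><c/><d/></b><e/></a>")
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node (pre INTEGER PRIMARY KEY, post INTEGER, tag TEXT)")
conn.executemany("INSERT INTO node VALUES (?, ?, ?)", encode(doc))
print(descendants(conn, "b"))
```

A node is a descendant of a context node exactly when it starts later in document order (larger pre rank) and finishes earlier (smaller post rank), which is why the axis reduces to a plain join condition.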
ACM Computing Surveys | 2013
Sherif Sakr; Anna Liu; Ayman G. Fayoumi
In the last two decades, the continuous increase of computational power has produced an overwhelming flow of data which has called for a paradigm shift in the computing architecture and large-scale data processing mechanisms. MapReduce is a simple and powerful programming model that enables easy development of scalable parallel applications to process vast amounts of data on large clusters of commodity machines. It isolates the application from the details of running a distributed program such as issues on data distribution, scheduling, and fault tolerance. However, the original implementation of the MapReduce framework had some limitations that have been tackled by many research efforts in several follow-up works after its introduction. This article provides a comprehensive survey of a family of approaches and mechanisms for large-scale data processing that have been implemented based on the original idea of the MapReduce framework and are currently gaining a lot of momentum in both research and industrial communities. We also cover a set of systems that have been implemented to provide declarative programming interfaces on top of the MapReduce framework. In addition, we review several large-scale data processing systems that resemble some of the ideas of the MapReduce framework for different purposes and application scenarios. Finally, we discuss some of the future research directions for implementing the next generation of MapReduce-like solutions.
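The programming model described in this abstract can be sketched in a few lines. The example below is a single-process word count that only mimics the map, shuffle, and reduce steps; the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are hypothetical, and the distribution, scheduling, and fault tolerance that a real framework handles are left out entirely.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Emit one (word, 1) pair per word in the document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    mapped = chain.from_iterable(map_phase(d) for d in documents)
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


counts = word_count(["big data on the cloud", "data on clusters"])
print(counts)
```

The application supplies only `map_phase` and `reduce_phase`; everything else is the framework's job, which is the isolation the abstract refers to.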
international conference on management of data | 2010
Sherif Sakr; Ghazi Al-Naymat
The Resource Description Framework (RDF) is a flexible model for representing information about resources in the web. With the increasing amount of RDF data which is becoming available, efficient and scalable management of RDF data has become a fundamental challenge to achieve the Semantic Web vision. The RDF model has attracted the attention of the database community and many researchers have proposed different solutions to store and query RDF data efficiently. This survey focuses on using relational query processors to store and query RDF data. We provide an overview of the different approaches and classify them according to their storage and query evaluation strategies.
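One way to picture relational RDF storage is a single three-column table of triples, where each pattern in a query becomes one reference to that table and shared variables become join conditions. The abstract does not list the surveyed layouts, so the `triples` table, the `ex:` sample data, and the helper `researchers_working_on` below are assumptions chosen for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("ex:alice", "rdf:type", "ex:Researcher"),
        ("ex:alice", "ex:worksOn", "ex:rdf-stores"),
        ("ex:bob", "rdf:type", "ex:Researcher"),
        ("ex:bob", "ex:worksOn", "ex:xml-compression"),
        ("ex:carol", "rdf:type", "ex:Student"),
        ("ex:carol", "ex:worksOn", "ex:rdf-stores"),
    ],
)


def researchers_working_on(topic):
    """Pattern: ?s rdf:type ex:Researcher . ?s ex:worksOn <topic>"""
    sql = """
        SELECT t1.subject
        FROM triples AS t1 JOIN triples AS t2 ON t1.subject = t2.subject
        WHERE t1.predicate = 'rdf:type' AND t1.object = 'ex:Researcher'
          AND t2.predicate = 'ex:worksOn' AND t2.object = ?
        ORDER BY t1.subject
    """
    return [row[0] for row in conn.execute(sql, (topic,))]


print(researchers_working_on("ex:rdf-stores"))
```

Each additional triple pattern adds another self-join on the same table, which hints at why the choice of storage layout and query evaluation strategy matters for larger queries.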
Journal of Computer and System Sciences | 2009
Sherif Sakr
XML has been acknowledged as the de facto standard for data representation and exchange over the World Wide Web. Being self-describing grants XML its great flexibility and wide acceptance, but it is also the cause of its main drawback: being huge in size. The huge document size means that the amount of information that has to be transmitted, processed, stored, and queried is often larger than that of other data formats. Several XML compression techniques have been introduced to deal with these problems. In this paper, we provide a complete survey of the state of the art of XML compression techniques. In addition, we present an extensive experimental study of the available implementations of these techniques. We report the behavior of nine XML compressors using a large corpus of XML documents which covers the different natures and scales of XML documents. In addition to assessing and comparing the performance characteristics of the evaluated XML compression tools, the study also tries to assess the effectiveness and practicality of using these tools in the real world. Finally, we provide some guidelines and recommendations to help developers and users make an effective decision when selecting the most suitable XML compression tool for their needs.
international world wide web conferences | 2010
Sherif Sakr; Ahmed Awad
We present a framework for querying and reusing graph-based business process models. The framework is based on a new visual query language for business processes called BPMN-Q. The language addresses process definitions and extends the standard BPMN visual notations for modeling business processes for its concrete syntax. BPMN-Q is used to query process models by matching a process model graph to a query graph. Moreover, the reusing framework is enhanced with a semantic query expander component. This component provides the users with the flexibility to get not only the perfectly matched process models to their queries but also the models with high similarity. The query engine of the framework is built on top of a traditional RDBMS. A novel decomposition-based and selectivity-aware relational processing mechanism is employed to achieve efficient and scalable performance for graph-based BPMN-Q queries.
Journal of Internet Services and Applications | 2012
Basem Suleiman; Sherif Sakr; D. Ross Jeffery; Anna Liu
The exposure of business applications to the web has considerably increased the variability of their workload patterns and volumes as the number of users/customers often grows and shrinks at various rates and times. Such application characteristics have increasingly demanded flexible yet inexpensive computing infrastructure to accommodate variable workloads. The on-demand and per-use cloud computing model, specifically that of public Cloud Infrastructure Service Offerings (CISOs), has quickly evolved and been adopted by the majority of hardware and software computing companies with the promise of provisioning utility-like computing resources at massive economies of scale. However, deploying business applications on public cloud infrastructure does not lead to achieving desired economics and elasticity gains, and some challenges block the way for realizing its real benefits. These challenges are due to multiple differences between CISOs and application’s requirements and characteristics. This article introduces a detailed analysis and discussion of the economics and elasticity challenges of business applications to be deployed and operated on public cloud infrastructure. This includes analysis of various aspects of public CISOs, modeling and measuring CISOs’ economics and elasticity, application workload patterns and their impact on achieving elasticity and economics, economics-driven elasticity decisions and policies, and SLA-driven monitoring and elasticity of cloud-based business applications. The analysis and discussion are supported with motivating scenarios for cloud-based business applications. The paper provides a multi-lenses overview that can help cloud consumers and potential business application’s owners to understand, analyze, and evaluate important economics and elasticity capabilities of different CISOs and their suitability for meeting their business application’s requirements.
conference on information and knowledge management | 2012
Sherif Sakr; Sameh Elnikety; Yuxiong He
We propose a SPARQL-like language, G-SPARQL, for querying attributed graphs. The language expresses types of queries which are of great interest for applications which model their data as large graphs, such as pattern matching, reachability and shortest path queries. Each query can combine both structural predicates and value-based predicates (on the attributes of the graph nodes and edges). We describe an algebraic compilation mechanism for our proposed query language which is extended from the relational algebra and based on the basic construct of building SPARQL queries, the Triple Pattern. We describe a hybrid Memory/Disk representation of large attributed graphs where only the topology of the graph is maintained in memory while the data of the graph is stored in a relational database. The execution engine of our proposed query language splits parts of the query plan to be pushed inside the relational database while other parts of the query plan are processed using memory-based algorithms, as necessary. Experimental results on real datasets demonstrate the efficiency and the scalability of our approach and show that our approach outperforms native graph databases by several factors.
international conference on cloud computing | 2012
Sherif Sakr; Anna Liu
One of the main advantages of the cloud computing paradigm is that it simplifies the time-consuming processes of hardware provisioning, hardware purchasing and software deployment. Currently, we are witnessing a proliferation in the number of cloud-hosted applications with a tremendous increase in the scale of the data generated as well as being consumed by such applications. Cloud-hosted database systems powering these applications form a critical component in the software stack of these applications. Service Level Agreements (SLAs) represent the contract which captures the agreed upon guarantees between a service provider and its customers. The specifications of existing SLAs for cloud services are not designed for flexibly handling even relatively straightforward performance and technical requirements of consumer applications. The concerns of consumers for cloud services regarding the SLA management of their hosted applications within the cloud environments will gain increasing importance as cloud computing becomes more pervasive. This paper introduces the notion, challenges and the importance of SLA-based provisioning and cost management for cloud-hosted databases from the consumer perspective. We present an end-to-end framework that acts as a middleware which resides between the consumer applications and the cloud-hosted databases. The aim of the framework is to facilitate adaptive and dynamic provisioning of the database tier of the software applications based on application-defined policies for satisfying their own SLA performance requirements, avoiding the cost of any SLA violation and controlling the monetary cost of the allocated computing resources. The experimental results demonstrate that SLA-based provisioning is more adequate for providing consumer applications the required flexibility in achieving their goals.
business process management | 2011
Seyed-Mehdi-Reza Beheshti; Boualem Benatallah; Hamid Reza Motahari-Nezhad; Sherif Sakr
The execution of a business process (BP) in today's enterprises may involve a workflow and multiple IT systems and services. Often no complete, up-to-date documentation of the model or correlation information of process events exists. Understanding the execution of a BP in terms of its scope and details is challenging, especially as it is subjective: it depends on the perspective of the person looking at BP execution. We present a framework, simple abstractions and a language for the explorative querying and understanding of BP execution from various user perspectives. We propose a query language for analyzing event logs of process-related systems based on the two concepts of folders and paths, which enable an analyst to group related events in the logs or find paths among events. Folders and paths can be stored to be used in follow-on analysis. We have implemented the proposed techniques and the language, FPSPARQL, by extending the SPARQL graph query language. We present the evaluation results on the performance and the quality of the results using a number of process event logs.
Collaboration
Commonwealth Scientific and Industrial Research Organisation