Publication


Featured research published by Qutaibah M. Malluhi.


Computing | 2016

A survey and taxonomy on energy efficient resource allocation techniques for cloud computing systems

Abdul Hameed; Alireza Khoshkbarforoushha; Rajiv Ranjan; Prem Prakash Jayaraman; Joanna Kolodziej; Pavan Balaji; Sherali Zeadally; Qutaibah M. Malluhi; Nikos Tziritas; Abhinav Vishnu; Samee Ullah Khan; Albert Y. Zomaya

In the cloud computing paradigm, energy-efficient allocation of different virtualized ICT resources (servers, storage disks, networks, and the like) is a complex problem due to the presence of heterogeneous application workloads (e.g., content delivery networks, MapReduce, web applications, and the like) with contending allocation requirements in terms of ICT resource capacities (e.g., network bandwidth, processing speed, response time, etc.). Several recent papers have tried to address the issue of improving energy efficiency in allocating cloud resources to applications, with varying degrees of success. However, to the best of our knowledge, no published literature on this subject clearly articulates the research problem and provides a research taxonomy for succinct classification of existing techniques. Hence, the main aim of this paper is to identify open challenges associated with energy-efficient resource allocation. To this end, the study first outlines the problem and the existing hardware- and software-based techniques available for this purpose. The techniques already presented in the literature are then summarized using the energy-efficient research dimension taxonomy. The advantages and disadvantages of the existing techniques are comprehensively analyzed against the proposed research dimension taxonomy, namely: resource adaptation policy, objective function, allocation method, allocation operation, and interoperability.
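
The five taxonomy dimensions lend themselves to a simple structured representation. Below is a minimal, hypothetical Python sketch; the field names paraphrase the survey's dimensions, and the example values are illustrative, not classifications taken from the paper:

```python
from dataclasses import dataclass

@dataclass
class AllocationTechnique:
    """Classifies an energy-efficient allocation technique along the
    survey's five research dimensions."""
    name: str
    resource_adaptation_policy: str  # e.g., "DVFS", "VM consolidation"
    objective_function: str          # e.g., "minimize energy"
    allocation_method: str           # e.g., "greedy heuristic", "ILP"
    allocation_operation: str        # e.g., "placement", "migration"
    interoperability: bool           # usable across heterogeneous platforms?

# Illustrative entry only; not drawn from the survey's classification.
example = AllocationTechnique(
    name="hypothetical-consolidator",
    resource_adaptation_policy="VM consolidation",
    objective_function="minimize energy",
    allocation_method="greedy heuristic",
    allocation_operation="migration",
    interoperability=False,
)
```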


european symposium on research in computer security | 2012

Secure and Efficient Outsourcing of Sequence Comparisons

Marina Blanton; Mikhail J. Atallah; Keith B. Frikken; Qutaibah M. Malluhi

We treat the problem of secure outsourcing of sequence comparisons by a client to remote servers, which, given two strings λ and μ of respective lengths n and m, consists of finding a minimum-cost sequence of insertions, deletions, and substitutions (also called an edit script) that transforms λ into μ. In our setting, a client owns λ and μ and outsources the computation to two servers without revealing to them information about either the input strings or the output sequence. Our solution is non-interactive for the client (who only sends information about the inputs and receives the output), and the client's work is linear in its input/output. The servers' performance is O(σmn) computation (which is optimal) and communication, where σ is the alphabet size, and the solution is designed to work when the servers have only O(σ(m + n)) memory. By utilizing garbled circuit evaluation in a novel way, we completely avoid public-key cryptography, which makes our solution particularly efficient.
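
For reference, the computation being outsourced is the classic edit-distance dynamic program. A minimal plaintext Python version is sketched below; the paper's contribution is carrying out this computation securely across two servers via garbled circuits, which this sketch does not attempt:

```python
def edit_distance(lam: str, mu: str) -> int:
    """Minimum number of insertions, deletions, and substitutions
    transforming lam into mu (unit costs; Wagner-Fischer DP)."""
    n, m = len(lam), len(mu)
    # prev[j] holds the cost of transforming lam[:i-1] into mu[:j].
    prev = list(range(m + 1))
    for i in range(1, n + 1):
        cur = [i] + [0] * m
        for j in range(1, m + 1):
            cur[j] = min(
                prev[j] + 1,                              # delete lam[i-1]
                cur[j - 1] + 1,                           # insert mu[j-1]
                prev[j - 1] + (lam[i - 1] != mu[j - 1]),  # substitute
            )
        prev = cur
    return prev[m]

assert edit_distance("kitten", "sitting") == 3
```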


very large data bases | 2016

LocationSpark: a distributed in-memory data management system for big spatial data

Mingjie Tang; Yongyang Yu; Qutaibah M. Malluhi; Mourad Ouzzani; Walid G. Aref

We present LocationSpark, a spatial data processing system built on top of Apache Spark, a widely used distributed data processing system. LocationSpark offers a rich set of spatial query operators, e.g., range search, kNN, spatio-textual operations, spatial join, and kNN join. To achieve high performance, LocationSpark employs various spatial indexes for in-memory data and guarantees that immutable spatial indexes have low overhead with fault tolerance. In addition, we build two new layers over Spark, namely a query scheduler and a query executor. The query scheduler is responsible for mitigating skew in spatial queries, while the query executor selects the best plan based on the indexes and the nature of the spatial queries. Furthermore, to avoid unnecessary network communication overhead when processing overlapping spatial data, we embed an efficient spatial Bloom filter into LocationSpark's indexes. Finally, LocationSpark tracks frequently accessed spatial data and dynamically flushes less frequently accessed data to disk. We evaluate our system on real workloads and demonstrate that it achieves an order-of-magnitude performance gain over a baseline framework.
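
As background on the kind of in-memory spatial indexing described, here is a minimal uniform-grid range-search sketch in Python. LocationSpark itself is a distributed Spark-based system with richer indexes; this toy (with hypothetical names) illustrates only the idea of bucketing points so a range query touches few cells:

```python
from collections import defaultdict

class GridIndex:
    """Toy uniform-grid index supporting rectangular range search."""
    def __init__(self, cell: float):
        self.cell = cell
        self.buckets = defaultdict(list)  # (cx, cy) -> [(x, y, payload)]

    def _key(self, x, y):
        return (int(x // self.cell), int(y // self.cell))

    def insert(self, x, y, payload):
        self.buckets[self._key(x, y)].append((x, y, payload))

    def range_search(self, xmin, ymin, xmax, ymax):
        cx0, cy0 = self._key(xmin, ymin)
        cx1, cy1 = self._key(xmax, ymax)
        for cx in range(cx0, cx1 + 1):        # visit only overlapping cells
            for cy in range(cy0, cy1 + 1):
                for x, y, p in self.buckets.get((cx, cy), ()):
                    if xmin <= x <= xmax and ymin <= y <= ymax:
                        yield p

idx = GridIndex(cell=1.0)
idx.insert(0.5, 0.5, "a"); idx.insert(2.5, 2.5, "b")
assert list(idx.range_search(0, 0, 1, 1)) == ["a"]
```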


IEEE Transactions on Services Computing | 2016

Skyline Discovery and Composition of Multi-Cloud Mashup Services

F. Zhang; Kai Hwang; Samee Ullah Khan; Qutaibah M. Malluhi

A cloud mashup is composed of multiple services with shared datasets and integrated functionalities. For example, the Elastic Compute Cloud (EC2) provided by Amazon Web Services (AWS), the authentication and authorization services provided by Facebook, and the map service provided by Google can all be mashed up to deliver a real-time, personalized driving-route recommendation service. To discover qualified services and compose them with guaranteed quality of service (QoS), we propose an integrated skyline query processing method for building cloud mashup applications. We use a similarity test to achieve an optimal localized skyline. This mashup method scales well with the growing number of cloud sites involved in the mashup applications. Faster skyline selection, reduced composition time, dataset sharing, and resource integration assure the QoS over multiple clouds. We experiment with the quality of Web service (QWS) benchmark over 10,000 Web services along six QoS dimensions. By utilizing block elimination, data-space partitioning, and service-similarity pruning, the skyline process is shortened threefold compared with two state-of-the-art methods.
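
The core primitive here is the skyline: the Pareto-optimal subset of services under multiple QoS dimensions. A minimal block-nested-loop skyline sketch in Python, assuming lower values are better on every dimension (a simplification; the paper's method adds block elimination, partitioning, and similarity pruning on top of this basic idea):

```python
def dominates(a, b):
    """a dominates b if a is no worse on all QoS dimensions and
    strictly better on at least one (lower is better here)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def skyline(points):
    window = []
    for p in points:
        if any(dominates(q, p) for q in window):
            continue                                   # p is dominated: discard
        window = [q for q in window if not dominates(p, q)]
        window.append(p)
    return window

# QoS vectors (latency, cost): (1,3) and (2,1) survive; (2,4) is dominated.
assert skyline([(1, 3), (2, 4), (2, 1)]) == [(1, 3), (2, 1)]
```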


IEEE Computer | 2013

Trust in Cloud Services: Providing More Controls to Clients

Khaled M. Khan; Qutaibah M. Malluhi

Trust is more important than money and will ultimately determine cloud computing's success.


BMC Genomics | 2014

Assessment of de novo assemblers for draft genomes: a case study with fungal genomes

Mostafa M. Abbas; Qutaibah M. Malluhi; Ponnuraman Balakrishnan

Background: Recently, many large bio-projects dealing with the release of different genomes have emerged. Most of these projects use next-generation sequencing platforms, and consequently many de novo assembly tools have evolved to assemble the reads generated by these platforms. Each tool has its own inherent advantages and disadvantages, which makes selecting an appropriate tool a challenging task.

Results: We have evaluated the performance of frequently used de novo assemblers, namely ABySS, IDBA-UD, Minia, SOAP, SPAdes, Sparse, and Velvet. These assemblers are assessed based on the quality of their output during assembly of fungal data. We compared the performance of these assemblers by considering both computational and quality metrics. By analyzing these performance metrics, the assemblers are ranked and a procedure for choosing the candidate assembler is illustrated.

Conclusions: In this study, we propose an assessment method for selecting de novo assemblers by considering their computational as well as quality metrics at the draft-genome level. We divide the quality metrics into three groups: g1 measures the goodness of the assemblies, g2 measures the problems of the assemblies, and g3 measures the conservation elements in the assemblies. Our results demonstrate that the assemblers ABySS and IDBA-UD perform well on the studied fungal genome data in terms of running time, memory, and quality. The results suggest that whole-genome shotgun sequencing projects should make use of different assemblers by considering their merits.
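
One standard "goodness" statistic for draft assemblies of this kind is N50. A minimal Python sketch is given below; note that the paper's metric groups g1-g3 cover many more measures than this single statistic:

```python
def n50(contig_lengths):
    """Length L such that contigs of length >= L together cover at
    least half of the total assembly length."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length

# 100 + 80 = 180 >= 270 / 2, so N50 is 80.
assert n50([100, 80, 50, 30, 10]) == 80
```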


Distributed and Parallel Databases | 2016

Performance analysis of data intensive cloud systems based on data management and replication: a survey

Saif Ur Rehman Malik; Samee Ullah Khan; Sam J. Ewen; Nikos Tziritas; Joanna Kolodziej; Albert Y. Zomaya; Sajjad Ahmad Madani; Nasro Min-Allah; Lizhe Wang; Cheng Zhong Xu; Qutaibah M. Malluhi; Johnatan E. Pecero; Pavan Balaji; Abhinav Vishnu; Rajiv Ranjan; Sherali Zeadally; Hongxiang Li

As we delve deeper into the 'Digital Age', we witness explosive growth in the volume, velocity, and variety of the data available on the Internet. For example, in 2012 about 2.5 quintillion bytes of data was created on a daily basis, originating from a myriad of sources and applications, including mobile devices, sensors, individual archives, social networks, Internet of Things, enterprises, cameras, software logs, etc. Such a 'Data Explosion' has led to one of the most challenging research issues of the current Information and Communication Technology era: how to optimally manage (e.g., store, replicate, filter, and the like) such large amounts of data and identify new ways to analyze them for unlocking information. It is clear that such large data streams cannot be managed by setting up on-premises enterprise database systems, as that leads to a large up-front cost in buying and administering the hardware and software systems. Therefore, next-generation data management systems must be deployed on the cloud. The cloud computing paradigm provides scalable and elastic resources, such as data and services accessible over the Internet. Every cloud service provider must ensure that data is efficiently processed and distributed in a way that does not compromise end-users' Quality of Service (QoS) in terms of data availability, data search delay, data analysis delay, and the like. In this perspective, data replication is used in the cloud to improve the performance (e.g., read and write delay) of applications that access data. Through replication, a data-intensive application or system can achieve high availability, better fault tolerance, and data recovery. In this paper, we survey data management and replication approaches (from 2007 to 2011) developed by both industrial and research communities. The focus of the survey is to discuss and characterize the existing approaches to data replication and management that tackle resource usage and QoS provisioning with different levels of efficiency. Moreover, the breakdown of both key concepts (data replication and management) into the QoS attributes they provide is discussed. Furthermore, the performance advantages and disadvantages of data replication and management approaches in cloud computing environments are analyzed. Open issues and future challenges related to data consistency, scalability, load balancing, processing, and placement are also reported.
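
To make the read-delay benefit of replication concrete, here is a toy replica-selection sketch with hypothetical site names; the surveyed systems use far richer placement and selection policies:

```python
def nearest_replica(replicas, client_latency):
    """Serve a read from the replica with the lowest measured latency.

    replicas: iterable of site names holding a copy of the object.
    client_latency: dict mapping site name -> round-trip time (ms).
    """
    return min(replicas, key=lambda site: client_latency[site])

# Illustrative: three sites hold a copy; reads go to the closest one.
latency = {"us-east": 12.0, "eu-west": 85.0, "ap-south": 140.0}
assert nearest_replica(["us-east", "eu-west", "ap-south"], latency) == "us-east"
```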


international conference on information and communication security | 2013

Secure and Private Outsourcing of Shape-Based Feature Extraction

Shumiao Wang; Mohamed Nassar; Mikhail J. Atallah; Qutaibah M. Malluhi

There has been much recent work on secure storage outsourcing, where an organization wants to store its data at untrusted remote cloud servers in encrypted form, such that its own employees can query the encrypted data using weak devices (both computationally and storage-wise). Alternatively, a weak client may want to outsource an expensive computational task without revealing to the servers either the inputs or the computed outputs. The framework requires that the bulk of the computational burden of query processing be placed on the remote servers, without revealing to these servers anything about the data. Most existing work in this area deals with keyword-based, non-image data; the present paper deals with raw image data (without any keyword annotations). We demonstrate that shape-based image feature extraction, a particularly computationally intensive task, can be carried out within this framework, by presenting two schemes for doing so and demonstrating their viability through experimental evaluation. Our results can be used in a number of practical situations. In one scenario, the client has images and wants to securely outsource shape-based feature extraction on them; in another, the server has encrypted images and the client wants a feature-extracted representation of those that are feature-rich.
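
As background on the plaintext computation, here is one common shape-based descriptor, a centroid-distance signature, sketched in Python. The abstract does not name the specific features used, and the paper's point is computing such features over encrypted data, which this sketch does not do:

```python
import math

def centroid_distance_signature(boundary, k=8):
    """Distance from the shape centroid to k evenly spaced boundary
    points -- a simple shape feature over a traced contour.

    boundary: list of (x, y) points tracing the shape's outline.
    """
    cx = sum(x for x, _ in boundary) / len(boundary)
    cy = sum(y for _, y in boundary) / len(boundary)
    step = len(boundary) / k
    return [math.hypot(boundary[int(i * step)][0] - cx,
                       boundary[int(i * step)][1] - cy)
            for i in range(k)]

# A small square contour; all four sampled corners are sqrt(2) from the centroid.
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
print(centroid_distance_signature(square, k=4))
```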


IEEE Transactions on Knowledge and Data Engineering | 2016

Similarity Group-by Operators for Multi-Dimensional Relational Data

Mingjie Tang; Ruby Y. Tahboub; Walid G. Aref; Mikhail J. Atallah; Qutaibah M. Malluhi; Mourad Ouzzani; Yasin N. Silva

The SQL group-by operator plays an important role in summarizing and aggregating large datasets in a data analytics stack. The Similarity SQL-based Group-By operator (SGB, for short) extends the semantics of the standard SQL group-by by grouping data with similar but not necessarily equal values. While existing similarity-based grouping operators efficiently realize these approximate semantics, they primarily focus on one-dimensional attributes and treat the dimensions of multi-dimensional attributes independently. As a result, correlated attributes, such as those in spatial data, are processed separately, and groups in the multi-dimensional space are not detected properly. To address this problem, we introduce two new SGB operators for multi-dimensional data. The first operator is the clique (or distance-to-all) SGB, where all the tuples in a group are within some distance of each other. The second operator is the distance-to-any SGB, where a tuple belongs to a group if the tuple is within some distance of any other tuple in the group. Since a tuple may satisfy the membership criterion of multiple groups, we introduce three different semantics to deal with such a case: (i) eliminate the tuple, (ii) put the tuple in any one group, and (iii) create a new group for this tuple. We implement and test the new SGB operators and their algorithms inside PostgreSQL. The overhead introduced by these operators proves to be minimal, and the execution times are comparable to those of the standard group-by. The experimental study, based on TPC-H and social check-in data, demonstrates that the proposed algorithms can achieve up to three orders of magnitude performance enhancement over baseline methods developed to solve the same problem.
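
Of the two operators, distance-to-any grouping is the easier to sketch outside a database engine: it amounts to computing connected components of the epsilon-neighbor graph. A minimal Python version using union-find follows; it is illustrative only, since the paper implements these operators inside PostgreSQL:

```python
def distance_to_any_groups(points, eps):
    """Group points so that each one is within eps of at least one other
    point in its group (connected components of the eps-graph)."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            # Compare squared distances to avoid a sqrt.
            if sum((a - b) ** 2 for a, b in zip(points[i], points[j])) <= eps ** 2:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(points[i])
    return list(groups.values())

pts = [(0, 0), (1, 0), (2, 0), (10, 10)]
print(distance_to_any_groups(pts, eps=1.5))  # chains the first three; (10, 10) alone
```

Note that the first three points form one group even though (0, 0) and (2, 0) are more than eps apart; each point only needs to be within eps of some other member, which is exactly what distinguishes distance-to-any from clique (distance-to-all) semantics.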


IEEE Transactions on Knowledge and Data Engineering | 1995

Combinatorial optimization of distributed queries

Bojan Groselj; Qutaibah M. Malluhi

In relational distributed databases, the cost of a query consists of a local cost and a transmission cost, and query optimization is a combinatorial optimization problem. As the query size grows, optimization methods based on exhaustive search become too expensive. We propose the following strategy for solving large distributed query optimization problems in relational database systems: (1) represent each query-processing schedule by a labeled directed graph; (2) reduce the number of different schedules by pruning away invalid or high-cost solutions; and (3) find a suboptimal schedule by combinatorial optimization. We investigate several combinatorial optimization techniques: random search, single start, multistart, simulated annealing, and a combination of random search and local simulated annealing. The utility of combinatorial optimization is demonstrated on the problem of finding the (sub)optimal semijoin schedule that fully reduces all relations of a tree query. The combination of random search and local simulated annealing was superior to the other tested methods.
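
For context on step (3), here is a generic simulated-annealing skeleton in Python. The cost function and neighbor move are placeholders; the paper applies this style of search to semijoin schedules represented as labeled directed graphs:

```python
import math
import random

def simulated_annealing(initial, cost, neighbor,
                        t0=1.0, cooling=0.95, steps_per_temp=100, t_min=1e-3):
    """Generic simulated annealing: accept worse candidates with
    probability exp(-delta / T) so the search can escape local minima."""
    current, current_cost = initial, cost(initial)
    best, best_cost = current, current_cost
    t = t0
    while t > t_min:
        for _ in range(steps_per_temp):
            cand = neighbor(current)
            delta = cost(cand) - current_cost
            if delta <= 0 or random.random() < math.exp(-delta / t):
                current, current_cost = cand, current_cost + delta
                if current_cost < best_cost:
                    best, best_cost = current, current_cost
        t *= cooling  # geometric cooling schedule
    return best, best_cost

# Toy usage: minimize a 1-D function as a stand-in for a schedule cost.
best, c = simulated_annealing(
    initial=5.0,
    cost=lambda x: (x - 2) ** 2,
    neighbor=lambda x: x + random.uniform(-0.5, 0.5),
)
print(best, c)
```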

Collaboration


Dive into Qutaibah M. Malluhi's collaborations.

Top Co-Authors

Mourad Ouzzani, Qatar Computing Research Institute

Yongge Wang, University of North Carolina at Charlotte