Mohamed E. Khalefa
University of Minnesota
Publication
Featured research published by Mohamed E. Khalefa.
international conference on data engineering | 2010
Justin J. Levandoski; Mohamed F. Mokbel; Mohamed E. Khalefa
Personalized database systems give users answers tailored to their personal preferences. While numerous preference evaluation methods for databases have been proposed (e.g., skyline, top-k, k-dominance, k-frequency), the implementation of these methods at the core of a database system is a double-edged sword. Core implementation provides efficient query processing for arbitrary database queries; however, this approach is not practical, as each existing (and future) preference method requires a custom query processor implementation. To solve this problem, this paper introduces FlexPref, a framework for extensible preference evaluation in database systems. FlexPref, implemented in the query processor, aims to support a wide array of preference evaluation methods in a single extensible code base. Integration with FlexPref is simple, involving the registration of only three functions that capture the essence of the preference method. Once integrated, the preference method “lives” at the core of the database, enabling the efficient execution of preference queries involving common database operations. To demonstrate the extensibility of FlexPref, we provide case studies showing the implementation of three database operations (single table access, join, and sorted list access) and five state-of-the-art preference evaluation methods (top-k, skyline, k-dominance, top-k dominance, and k-frequency). We also experimentally study the strengths and weaknesses of an implementation of FlexPref in PostgreSQL over a range of single-table and multi-table preference queries.
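As an illustration of one of the preference methods named above, the following is a minimal Python sketch of a skyline computation built on a pairwise dominance test. It is not FlexPref's interface (the abstract does not name the three registered functions); the function names `dominates` and `skyline`, the `hotels` data, and the convention that smaller values are preferred on every attribute are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(p: Sequence[float], q: Sequence[float]) -> bool:
    """Return True if p is no worse than q on every attribute and
    strictly better on at least one (assumption: smaller is better)."""
    no_worse = all(a <= b for a, b in zip(p, q))
    strictly_better = any(a < b for a, b in zip(p, q))
    return no_worse and strictly_better


def skyline(points: List[Point]) -> List[Point]:
    """Return the points that no other point dominates."""
    result: List[Point] = []
    for p in points:
        # Skip p if a point already kept dominates it.
        if any(dominates(q, p) for q in result):
            continue
        # Drop kept points that p dominates, then keep p.
        result = [q for q in result if not dominates(p, q)]
        result.append(p)
    return result


# Hypothetical data: (price, distance to the beach in km).
hotels = [(50.0, 3.0), (80.0, 1.0), (60.0, 4.0), (90.0, 2.0)]
print(skyline(hotels))  # [(50.0, 3.0), (80.0, 1.0)]
```

The other methods listed (top-k, k-dominance, k-frequency) would replace the dominance test or the selection rule, which is the kind of variation an extensible framework has to accommodate.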
conference on information and knowledge management | 2010
Mohamed E. Khalefa; Mohamed F. Mokbel; Justin J. Levandoski
Recently, several research efforts have addressed answering skyline queries efficiently over large datasets. However, this research lacks methods to compute these queries over uncertain data, where uncertain values are represented as a range. In this paper, we define skyline queries over continuous uncertain data and propose a novel, efficient framework to answer these queries. Query answers are probabilistic: each object is associated with a probability of being a query answer. Typically, users specify a probability threshold that each returned object must exceed and a tolerance value that defines the allowed error margin in the probability calculation, which reduces the computational overhead. Our framework employs an efficient two-phase query processing algorithm.
international conference on data engineering | 2011
Mohamed E. Khalefa; Mohamed F. Mokbel; Justin J. Levandoski
Preference queries are essential to a wide spectrum of applications, including multi-criteria decision-making tools and personalized databases. Unfortunately, most evaluation techniques for preference queries assume that the set of preferred attributes is stored in only one relation, which excludes a wide set of queries that involve preference computations over multiple relations. This paper presents PrefJoin, an efficient preference-aware join query operator designed specifically to deal with preference queries over multiple relations. PrefJoin consists of four main phases: Local Pruning, Data Preparation, Joining, and Refining. These phases, respectively, (1) filter out, from each input relation, those tuples that are guaranteed not to be in the final preference set; (2) associate metadata with each non-filtered tuple to optimize the execution of the later phases; (3) produce the subset of the join result that is relevant to the given preference function; and (4) refine these tuples. An interesting characteristic of PrefJoin is that it tightly integrates preference computation with the join, so tuples that are guaranteed not to be in the answer are pruned early, saving significant unnecessary computation cost. PrefJoin supports a variety of preference functions, including skyline, multi-objective, and k-dominance preference queries. We show the correctness of PrefJoin. Experimental evaluation based on a real system implementation inside PostgreSQL shows that PrefJoin consistently achieves from one to three orders of magnitude performance gain over its competitors in various scenarios.
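The idea of pruning each input relation before the join can be sketched as follows. This is not PrefJoin's actual Local Pruning rule, which the abstract does not spell out. It shows a standard observation for skyline-style preferences under two assumptions: the join is an equality join, and smaller values are preferred on every attribute. In that setting, a row dominated by another row with the same join key cannot contribute to the final answer, because any join partner of the dominated row also joins with the dominating one. The names `local_prune` and `flights` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple

Row = Tuple[Hashable, Tuple[float, ...]]  # (join key, preference attributes)


def dominates(p: Sequence[float], q: Sequence[float]) -> bool:
    """p is no worse than q everywhere and strictly better somewhere
    (assumption: smaller is better on every attribute)."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def local_prune(rows: List[Row]) -> List[Row]:
    """Drop every row dominated by another row with the same join key."""
    groups: Dict[Hashable, List[Tuple[float, ...]]] = defaultdict(list)
    for key, prefs in rows:
        groups[key].append(prefs)
    return [
        (key, prefs)
        for key, prefs in rows
        if not any(dominates(other, prefs) for other in groups[key])
    ]


# Hypothetical relation: join key is the destination city,
# preference attributes are (price, number of stops).
flights = [("NYC", (300.0, 2.0)), ("NYC", (350.0, 3.0)), ("CHI", (400.0, 1.0))]
print(local_prune(flights))  # [('NYC', (300.0, 2.0)), ('CHI', (400.0, 1.0))]
```

Rows with different join keys are never compared here, since they may pair with different tuples from the other relation and therefore cannot be pruned against each other at this stage.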
very large data bases | 2010
Justin J. Levandoski; Mohamed F. Mokbel; Mohamed E. Khalefa
In this paper, we aim to realize a context- and preference-aware database system, CareDB, that provides scalable personalized location-based services to users based on their preferences and current surrounding context. Unlike existing location-based database systems that answer queries based solely on proximity in distance, CareDB considers user preferences and various types of context in determining the answer to location-based queries. To this end, CareDB does not aim to define new location-based queries; instead, it aims to redefine the answer of existing location-based queries. The PhD thesis topics covered in this paper solve novel, core systems issues that help realize CareDB. These issues are: (1) efficient and extensible core DBMS query processor support for numerous preference evaluation methods, (2) core DBMS support for preference query processing in the face of expensive contextual data, and (3) support for continuous preference- and context-aware query processing. We demonstrate CareDB, a context- and preference-aware database system. CareDB provides scalable personalized location-based services to users based on their preferences and current surrounding context. Unlike existing location-based database systems that answer queries based solely on proximity in distance, CareDB considers user preferences and various types of context in determining the answer to location-based queries. To this end, CareDB does not aim to define new location-based queries; instead, it aims to redefine the answer of existing location-based queries. To achieve its goals, CareDB has several distinguishing characteristics that revolve around a generic and extensible preference- and context-aware query processing framework that addresses (a) scalable, efficient preference joins, (b) graceful handling of contextual attributes that are expensive to derive, and (c) support for uncertain attributes.
ACM Transactions on Database Systems | 2013
Justin J. Levandoski; Ahmed Eldawy; Mohamed F. Mokbel; Mohamed E. Khalefa
Personalized database systems give users answers tailored to their personal preferences. While numerous preference evaluation methods for databases have been proposed (e.g., skyline, top-k, k-dominance, k-frequency), the implementation of these methods at the core of a database system is a double-edged sword. Core implementation provides efficient query processing for arbitrary database queries; however, this approach is not practical, since each existing (and future) preference method requires implementation within the database engine. To solve this problem, this article introduces FlexPref, a framework for extensible preference evaluation in database systems. FlexPref, implemented in the query processor, aims to support a wide array of preference evaluation methods in a single extensible code base. Integration with FlexPref is simple, involving the registration of only three functions that capture the essence of the preference method. Once integrated, the preference method “lives” at the core of the database, enabling the efficient execution of preference queries involving common database operations. This article also provides a query optimization framework for FlexPref, as well as a theoretical framework that defines the properties a preference method must exhibit to be implemented in FlexPref. To demonstrate the extensibility of FlexPref, this article also provides case studies detailing the implementation of seven state-of-the-art preference evaluation methods within FlexPref. We also experimentally study the strengths and weaknesses of an implementation of FlexPref in PostgreSQL over a range of single-table and multitable preference queries.
IEEE Transactions on Knowledge and Data Engineering | 2011
Justin J. Levandoski; Mohamed E. Khalefa; Mohamed F. Mokbel
This paper introduces an efficient framework for producing high and early result throughput in multijoin query plans. While most previous research focuses on optimizing for cases involving a single join operator, this work takes a radical step by addressing query plans with multiple join operators. The proposed framework consists of two main methods, a flush algorithm and operator state manager. The framework assumes a symmetric hash join, a common method for producing early results, when processing incoming data. In this way, our methods can be applied to a group of previous join operators (optimized for single-join queries) when taking part in multijoin query plans. Specifically, our framework can be applied by 1) employing a new flushing policy to write in-memory data to disk, once memory allotment is exhausted, in a way that helps increase the probability of producing early result throughput in multijoin queries, and 2) employing a state manager that adaptively switches operators in the plan between joining in-memory data and disk-resident data in order to positively affect the early result throughput. Extensive experimental results show that the proposed methods outperform the state-of-the-art join operators optimized for both single and multijoin query plans.
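The symmetric hash join that the framework assumes can be sketched in a few lines of Python. This is a minimal in-memory version only: the flushing policy and the operator state manager described above are not modeled, and the function name `symmetric_hash_join`, the `"L"`/`"R"` side labels, and the `arrivals` data are assumptions made for this example.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, Hashable, Iterable, Iterator, List, Tuple

Arrival = Tuple[str, Hashable, str]   # (side, join key, payload)
Joined = Tuple[Hashable, str, str]    # (join key, left payload, right payload)


def symmetric_hash_join(stream: Iterable[Arrival]) -> Iterator[Joined]:
    """Emit each join result as soon as both matching tuples have arrived.

    One hash table is kept per input. Every arriving tuple is inserted
    into its own side's table and immediately probed against the other
    side's table, so neither input has to be read completely first.
    """
    tables: Dict[str, DefaultDict[Hashable, List[str]]] = {
        "L": defaultdict(list),
        "R": defaultdict(list),
    }
    for side, key, payload in stream:
        other = "R" if side == "L" else "L"
        tables[side][key].append(payload)
        # .get avoids creating empty buckets in the other side's table.
        for match in tables[other].get(key, []):
            if side == "L":
                yield (key, payload, match)
            else:
                yield (key, match, payload)


# Hypothetical interleaved arrivals from two inputs.
arrivals = [("L", 1, "a"), ("R", 2, "x"), ("R", 1, "y"), ("L", 2, "b"), ("L", 1, "c")]
print(list(symmetric_hash_join(arrivals)))
# [(1, 'a', 'y'), (2, 'b', 'x'), (1, 'c', 'y')]
```

Because both hash tables grow with the input, a real operator has to decide what to write to disk once its memory allotment is exhausted, which is the decision the flushing policy in the abstract addresses.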
international conference on data engineering | 2008
Justin J. Levandoski; Mohamed E. Khalefa; Mohamed F. Mokbel
This paper introduces an efficient algorithm for Producing Early Results in Multi-join query plans (PermJoin, for short). While most previous research focuses only on the case of a single join operator, PermJoin takes a radical step by addressing query plans with multiple join operators. PermJoin is optimized to maximize the early overall throughput and to adapt to fluctuations in data arrival rates. PermJoin is a non-blocking operator that is capable of producing join results even if one or more data sources are blocked due to slow or bursty network behavior. Furthermore, PermJoin distinguishes itself from all previous techniques as it: (1) employs a new flushing policy to write in-memory data to disk, once memory allotment is exhausted, in a way that helps increase the probability of producing early result throughput in multi-join queries, and (2) employs a novel state manager module that adaptively switches operators between joining in-memory data and disk-resident data in order to maximize overall throughput.
statistical and scientific database management | 2016
Mohamed E. Khalefa; Sameh S. El-Atawy
We demonstrate KDBMS, a prototype system that seamlessly integrates a knowledge base with a DBMS. State-of-the-art approaches, i.e., ontology-based data access (OBDA), use ontologies only to query data stored in relational databases using SPARQL. In this demo, we present a high-level description of the proposed system, introduce a new knowledge-based query language, denoted KQL, and highlight query optimization opportunities that arise from employing knowledge across database layers in query optimization and query processing, while easing the administration of a complex database schema.
Proceedings of the 2nd Africa and Middle East Conference on Software Engineering | 2016
Sameh S. El-Atawy; Mohamed E. Khalefa
Electronic health record (EHR) solutions are complex, spanning multiple specialties and domains of expertise. These systems need to handle clinical concepts, temporal data, documents, and financial transactions, which leads to a large code base that is tightly coupled with data models and inherently hard to maintain. These difficulties can greatly increase the cost of developing EHR systems, result in a high failure rate of implementation, and threaten investments in this sector. Moreover, due to the wide variance in the level of detail across different settings, data exchange is becoming a serious problem, further increasing the cost of development and maintenance. To overcome these issues, we adopt ontologies to model our proposed EHR solution, which not only allows code reuse but also enables later extension and customization. Adopting software factory techniques, we build tools to transform ontological models into deployment-ready code. This automatically provides handling of data persistence, access, and exchange. Business logic is expressed as ontology-based process flows and rules, ensuring data quality and supporting special needs. This logic is enforced transparently and can be modified on the fly. We optimized the user experience by facilitating fast data entry and retrieval. In this paper, we present the requirements of an effective EHR solution, explain the techniques we employed, describe the main modules of our proposed system, and discuss the technical decisions we made.
international conference on data engineering | 2008
Mohamed E. Khalefa; Mohamed F. Mokbel; Justin J. Levandoski