Ahmed K. Elmagarmid | Researchain

Publication

Featured research published by Ahmed K. Elmagarmid.

IEEE Transactions on Knowledge and Data Engineering | 2007

Duplicate Record Detection: A Survey

Ahmed K. Elmagarmid; Panagiotis G. Ipeirotis; Vassilios S. Verykios

Often, in the real world, entities have two or more representations in databases. Duplicate records do not share a common key and/or they contain errors that make duplicate matching a difficult task. Errors are introduced as the result of transcription errors, incomplete information, lack of standard formats, or any combination of these factors. In this paper, we present a thorough analysis of the literature on duplicate record detection. We cover similarity metrics that are commonly used to detect similar field entries, and we present an extensive set of duplicate detection algorithms that can detect approximately duplicate records in a database. We also cover multiple techniques for improving the efficiency and scalability of approximate duplicate detection algorithms. We conclude with coverage of existing tools and with a brief discussion of the big open problems in the area.
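
As an illustration of the kind of field-level similarity metric the abstract refers to, the sketch below scores two field entries with a normalized edit (Levenshtein) distance. It is a minimal example, not code from the survey: the function names, the lowercase/strip normalization, and the 0.8 threshold are all assumptions chosen for the example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))  # distance from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; 1.0 means identical after normalization."""
    a, b = a.strip().lower(), b.strip().lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def is_probable_duplicate(a: str, b: str, threshold: float = 0.8) -> bool:
    """Flag two field entries as likely duplicates (threshold is assumed)."""
    return similarity(a, b) >= threshold
```

With these definitions, "Jon Smith" and "John Smith" differ by one inserted character and are flagged as probable duplicates, while unrelated names fall well below the threshold. A real system would combine several such field scores into a record-level decision.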

Very Large Data Bases | 2003

Composing Web services on the Semantic Web

Brahim Medjahed; Athman Bouguettaya; Ahmed K. Elmagarmid

Service composition is gaining momentum as the potential silver bullet for the envisioned Semantic Web. It purports to take the Web to unexplored efficiencies and provide a flexible approach for promoting all types of activities in tomorrow’s Web. Applications expected to take heavy advantage of Web service composition include B2B E-commerce and E-government. To date, enabling composite services has largely been an ad hoc, time-consuming, and error-prone process involving repetitive low-level programming. In this paper, we propose an ontology-based framework for the automatic composition of Web services. We present a technique to generate composite services from high-level declarative descriptions. We define formal safeguards for meaningful composition through the use of composability rules. These rules compare the syntactic and semantic features of Web services to determine whether two services are composable. We provide an implementation using an E-government application offering customized services to indigent citizens. Finally, we present an exhaustive performance experiment to assess the scalability of our approach.

ACM Computing Surveys | 1999

Client-server computing in mobile environments

Jin Jing; Abdelsalam Helal; Ahmed K. Elmagarmid

Recent advances in wireless data networking and portable information appliances have engendered a new paradigm of computing, called mobile computing, in which users carrying portable devices have access to data and information services regardless of their physical location or movement behavior. In the meantime, research addressing information access in mobile environments has proliferated. In this survey, we provide a concrete framework and categorization of the various ways of supporting mobile client-server computing for information access. We examine characteristics of mobility that distinguish mobile client-server computing from its traditional counterpart. We provide a comprehensive analysis of new paradigms and enabler concepts for mobile client-server computing, including mobile-aware adaptation, extended client-server model, and mobile data access. A comparative and detailed review of major research prototypes for mobile information access is also presented.

IEEE Transactions on Image Processing | 2001

Automatic image segmentation by integrating color-edge extraction and seeded region growing

Jianping Fan; David K. Y. Yau; Ahmed K. Elmagarmid; Walid G. Aref

We propose a new automatic image segmentation method. Color edges in an image are first obtained automatically by combining an improved isotropic edge detector and a fast entropic thresholding technique. After the obtained color edges have provided the major geometric structures in an image, the centroids between these adjacent edge regions are taken as the initial seeds for seeded region growing (SRG). These seeds are then replaced by the centroids of the generated homogeneous image regions by incorporating the required additional pixels step by step. Moreover, the results of color-edge extraction and SRG are integrated to provide homogeneous image regions with accurate and closed boundaries. We also discuss the application of our image segmentation method to automatic face detection. Furthermore, semantic human objects are generated by a seeded region aggregation procedure which takes the detected faces as object seeds.

IEEE Transactions on Knowledge and Data Engineering | 2004

Association rule hiding

Vassilios S. Verykios; Ahmed K. Elmagarmid; Elisa Bertino; Yücel Saygin; Elena Dasseni

Large repositories of data contain sensitive information that must be protected against unauthorized access. The protection of the confidentiality of this information has been a long-term goal for the database security research community and for government statistical agencies. Recent advances in data mining and machine learning algorithms have increased the disclosure risks that one may encounter when releasing data to outside parties. A key problem, and still not sufficiently investigated, is the need to balance the confidentiality of the disclosed data with the legitimate needs of the data users. Every disclosure limitation method affects, and in some way modifies, true data values and relationships. We investigate confidentiality issues of a broad category of rules, the association rules. In particular, we present three strategies and five algorithms for hiding a group of association rules that is characterized as sensitive. A rule is characterized as sensitive if its disclosure risk is above a certain privacy threshold. Sometimes, sensitive rules should not be disclosed to the public since, among other things, they may be used for inferring sensitive data, or they may provide business competitors with an advantage. We also perform an evaluation study of the hiding algorithms in order to analyze their time complexity and the impact that they have on the original database.

Very Large Data Bases | 2003

Business-to-business interactions: issues and enabling technologies

Brahim Medjahed; Boualem Benatallah; Athman Bouguettaya; Anne H. H. Ngu; Ahmed K. Elmagarmid

Business-to-Business (B2B) technologies pre-date the Web. They have existed for at least as long as the Internet. B2B applications were among the first to take advantage of advances in computer networking. The Electronic Data Interchange (EDI) business standard is an illustration of such an early adoption of the advances in computer networking. The ubiquity and affordability of the Web have made it possible for large numbers of businesses to automate their B2B interactions. However, several issues related to scale, content exchange, autonomy, heterogeneity, and other concerns still need to be addressed. In this paper, we survey the main techniques, systems, products, and standards for B2B interactions. We propose a set of criteria for assessing the different B2B interaction techniques, standards, and products.

Very Large Data Bases | 2004

Supporting top-k join queries in relational databases

Ihab F. Ilyas; Walid G. Aref; Ahmed K. Elmagarmid

Ranking queries, also known as top-k queries, produce results that are ordered on some computed score. Typically, these queries involve joins, where users are usually interested only in the top-k join results. Top-k queries are dominant in many emerging applications, e.g., multimedia retrieval by content, Web databases, data mining, middleware, and most information retrieval applications. Current relational query processors do not handle ranking queries efficiently, especially when joins are involved. In this paper, we address supporting top-k join queries in relational query processors. We introduce a new rank-join algorithm that makes use of the individual orders of its inputs to produce join results ordered on a user-specified scoring function. The idea is to rank the join results progressively during the join operation. We introduce two physical query operators based on variants of ripple join that implement the rank-join algorithm. The operators are nonblocking and can be integrated into pipelined execution plans. We also propose an efficient heuristic designed to optimize a top-k join query by choosing the best join order. We address several practical issues and optimization heuristics to integrate the new join operators in practical query processors. We implement the new operators inside a prototype database engine based on PREDATOR. The experimental evaluation of our approach compares recent algorithms for joining ranked inputs and shows superior performance.
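
The idea of ranking join results progressively can be sketched in a few lines. The example below is a simplified illustration under stated assumptions, not the paper's operators: it assumes two in-memory inputs of (join_key, score) pairs already sorted by score in descending order, an equality join on the key, a sum as the scoring function, and strictly alternating reads. The name rank_join and the data layout are hypothetical.

```python
import heapq
from itertools import count


def rank_join(left, right, k):
    """Return the k best (key, combined_score) join results.

    left and right are lists of (join_key, score) sorted by score, highest
    first. The combined score is the sum of both scores (an assumed,
    monotone scoring function).
    """
    if k <= 0 or not left or not right:
        return []

    inputs = (left, right)
    seen = ({}, {})            # per input: join_key -> scores read so far
    pos = [0, 0]               # next unread position in each input
    top = (left[0][1], right[0][1])
    last = [top[0], top[1]]    # score of the most recently read tuple
    heap = []                  # buffered join results, best first
    tie = count()              # tie-breaker so the heap never compares keys
    results = []

    while len(results) < k:
        done = [pos[0] >= len(left), pos[1] >= len(right)]
        if all(done):
            # Nothing left to read: the buffered results are final.
            while heap and len(results) < k:
                neg, _, key = heapq.heappop(heap)
                results.append((key, -neg))
            break

        # Read alternately, falling back to the input that still has tuples.
        side = 0 if (not done[0] and (done[1] or pos[0] <= pos[1])) else 1
        key, score = inputs[side][pos[side]]
        pos[side] += 1
        last[side] = score
        seen[side].setdefault(key, []).append(score)
        for other_score in seen[1 - side].get(key, []):
            heapq.heappush(heap, (-(score + other_score), next(tie), key))

        # Upper bound on the score of any join result not yet produced.
        threshold = max(top[0] + last[1], last[0] + top[1])
        while heap and len(results) < k and -heap[0][0] >= threshold:
            neg, _, key = heapq.heappop(heap)
            results.append((key, -neg))

    return results
```

A buffered result is emitted only once its score reaches the threshold, because no combination involving an unread tuple can then beat it. This is what lets the loop stop before both inputs are fully read when k is small.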

Information Hiding | 2001

Hiding Association Rules by Using Confidence and Support

Elena Dasseni; Vassilios S. Verykios; Ahmed K. Elmagarmid; Elisa Bertino

Large repositories of data contain sensitive information that must be protected against unauthorized access. Recent advances in data mining and machine learning algorithms have increased the disclosure risks one may encounter when releasing data to outside parties. A key problem, and still not sufficiently investigated, is the need to balance the confidentiality of the disclosed data with the legitimate needs of the data users. Every disclosure limitation method affects, and in some way modifies, true data values and relationships. In this paper, we investigate confidentiality issues of a broad category of rules, which are called association rules. If the disclosure risk of some of these rules is above a certain privacy threshold, those rules must be characterized as sensitive. Sometimes, sensitive rules should not be disclosed to the public since, among other things, they may be used for inferring sensitive data, or they may provide business competitors with an advantage.
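
To make the title's notions of support and confidence concrete, the sketch below computes both for a rule X -> Y over a small transaction set and then lowers the rule's confidence by deleting a consequent item from supporting transactions. This is a generic illustration under stated assumptions, not one of the paper's algorithms: the helper names, the greedy choice of which transactions to modify, and the use of a minimum-confidence cutoff as the hiding criterion are all hypothetical.

```python
def count(transactions, itemset):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)


def support(transactions, itemset):
    """Fraction of transactions that contain itemset."""
    if not transactions:
        return 0.0
    return count(transactions, set(itemset)) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """conf(X -> Y) = count(X and Y) / count(X)."""
    x = set(antecedent)
    n_x = count(transactions, x)
    if n_x == 0:
        return 0.0
    return count(transactions, x | set(consequent)) / n_x


def hide_rule(transactions, antecedent, consequent, min_conf):
    """Return a sanitized copy where conf(X -> Y) drops below min_conf.

    Assumes min_conf > 0. Greedily removes one consequent item from
    transactions that support the rule; the input is left unmodified.
    """
    sanitized = [set(t) for t in transactions]
    both = set(antecedent) | set(consequent)
    victim = sorted(consequent)[0]  # arbitrary but deterministic choice
    for t in sanitized:
        if confidence(sanitized, antecedent, consequent) < min_conf:
            break
        if both <= t:
            t.discard(victim)
    return sanitized
```

Each deletion changes a true data value, which is the trade-off the abstract describes: the sensitive rule is hidden at the cost of distorting the released database.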

Mobile Networks and Applications | 1997

Bit-sequences: an adaptive cache invalidation method in mobile client/server environments

Jin Jing; Ahmed K. Elmagarmid; Abdelsalam Helal; Rafael Alonso

In this paper, we present Bit-Sequences (BS), an adaptive cache invalidation algorithm for client/server mobile environments. The algorithm uses adaptable mechanisms to adjust the size of the invalidation report to optimize the use of a limited communication bandwidth while retaining the effectiveness of cache invalidation. The proposed BS algorithm is especially suited for dissemination-based (or “server-push”-based) nomadic information service applications. The critical aspect of our algorithm is its self-adaptability and effectiveness, regardless of the connectivity behavior of the mobile clients. The performance of BS is analyzed through a simulation study that compares BS’s effectiveness with that of a hypothetical optimal cache invalidation algorithm.

International Workshop on Research Issues in Data Engineering | 2002

Privacy preserving association rule mining

Yücel Saygin; Vassilios S. Verykios; Ahmed K. Elmagarmid

The current trend in the application space towards systems of loosely coupled and dynamically bound components that enable just-in-time integration jeopardizes the security of information that is shared between the broker, the requester, and the provider at runtime. In particular, new advances in data mining and knowledge discovery, which allow for the extraction of hidden knowledge from enormous amounts of data, pose new threats to the seamless integration of information. We consider the problem of building privacy preserving algorithms for one category of data mining techniques, association rule mining. We introduce new metrics in order to demonstrate how security issues can be taken into consideration in the general framework of association rule mining, and we show that the complexity of the new heuristics is similar to that of the original algorithms.