Walid G. Aref
Purdue University
Publication
Featured research published by Walid G. Aref.
IEEE Transactions on Image Processing | 2001
Jianping Fan; David K. Y. Yau; Ahmed K. Elmagarmid; Walid G. Aref
We propose a new automatic image segmentation method. Color edges in an image are first obtained automatically by combining an improved isotropic edge detector and a fast entropic thresholding technique. After the obtained color edges have provided the major geometric structures in an image, the centroids between these adjacent edge regions are taken as the initial seeds for seeded region growing (SRG). These seeds are then replaced by the centroids of the generated homogeneous image regions by incorporating the required additional pixels step by step. Moreover, the results of color-edge extraction and SRG are integrated to provide homogeneous image regions with accurate and closed boundaries. We also discuss the application of our image segmentation method to automatic face detection. Furthermore, semantic human objects are generated by a seeded region aggregation procedure which takes the detected faces as object seeds.
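As a rough illustration of the region-growing step described above, the following Python sketch grows a single region from one seed on a small grayscale grid. The function name `grow_region`, the 4-connectivity, and the fixed `tolerance` against the running region mean are assumptions made for this example; the color-edge seeding and centroid replacement described in the abstract are not reproduced.

```python
from collections import deque


def grow_region(image, seed, tolerance):
    """Grow a 4-connected region from `seed` on a 2-D grid of intensities.

    A neighboring pixel joins the region when its intensity is within
    `tolerance` of the current region mean (a simplifying assumption).
    """
    rows, cols = len(image), len(image[0])
    region = {seed}
    total = image[seed[0]][seed[1]]   # running sum of region intensities
    frontier = deque([seed])
    while frontier:
        r, c = frontier.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in region:
                mean = total / len(region)
                if abs(image[nr][nc] - mean) <= tolerance:
                    region.add((nr, nc))
                    total += image[nr][nc]
                    frontier.append((nr, nc))
    return region


# Hypothetical 3x3 image: a dark top-left block surrounded by bright pixels.
image = [
    [10, 11, 50],
    [12, 10, 52],
    [51, 49, 50],
]
region = grow_region(image, seed=(0, 0), tolerance=5)
```

Here the region stops at the bright pixels, which play the role that detected edges play in bounding a homogeneous region.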
international conference on management of data | 2004
Mohamed F. Mokbel; Xiaopeng Xiong; Walid G. Aref
This paper introduces the Scalable INcremental hash-based Algorithm (SINA, for short), a new algorithm for evaluating a set of concurrent continuous spatio-temporal queries. SINA is designed with two goals in mind: (1) scalability in terms of the number of concurrent continuous spatio-temporal queries, and (2) incremental evaluation of continuous spatio-temporal queries. SINA achieves scalability by employing a shared execution paradigm where the execution of continuous spatio-temporal queries is abstracted as a spatial join between a set of moving objects and a set of moving queries. Incremental evaluation is achieved by computing only the updates of the previously reported answer. We introduce two types of updates, namely positive and negative updates. Positive or negative updates indicate that a certain object should be added to or removed from the previously reported answer, respectively. SINA manages the computation of positive and negative updates via three phases: the hashing phase, the invalidation phase, and the joining phase. The hashing phase employs an in-memory hash-based join algorithm that results in a set of positive updates. The invalidation phase is triggered every T seconds or when the memory is fully occupied to produce a set of negative updates. Finally, the joining phase is triggered by the end of the invalidation phase to produce a set of both positive and negative updates that result from joining in-memory data with in-disk data. Experimental results show that SINA is scalable and is more efficient than other index-based spatio-temporal algorithms.
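The notion of positive and negative updates can be sketched in a few lines of Python. This is only an illustration of what the two update types mean, not SINA's three-phase algorithm: the names `incremental_updates` and `apply_updates` are hypothetical, and the sketch derives the updates by comparing two complete answers, whereas the abstract describes computing them without re-reporting the whole answer.

```python
def incremental_updates(previous, current):
    """Return (positive, negative) updates between two answers.

    Each answer is a set of (query_id, object_id) pairs.
    """
    positive = current - previous   # pairs to add to the reported answer
    negative = previous - current   # pairs to remove from the reported answer
    return positive, negative


def apply_updates(answer, positive, negative):
    """Bring a previously reported answer up to date."""
    return (answer | positive) - negative


# Hypothetical answers at two consecutive evaluation times.
previous = {("q1", "o1"), ("q1", "o2"), ("q2", "o3")}
current = {("q1", "o2"), ("q2", "o3"), ("q2", "o4")}

positive, negative = incremental_updates(previous, current)
updated = apply_updates(previous, positive, negative)
```

Only the two changed pairs need to be sent to the client; the unchanged pairs are never re-reported.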
very large data bases | 2004
Ihab F. Ilyas; Walid G. Aref; Ahmed K. Elmagarmid
Ranking queries, also known as top-k queries, produce results that are ordered on some computed score. Typically, these queries involve joins, where users are usually interested only in the top-k join results. Top-k queries are dominant in many emerging applications, e.g., multimedia retrieval by content, Web databases, data mining, middlewares, and most information retrieval applications. Current relational query processors do not handle ranking queries efficiently, especially when joins are involved. In this paper, we address supporting top-k join queries in relational query processors. We introduce a new rank-join algorithm that makes use of the individual orders of its inputs to produce join results ordered on a user-specified scoring function. The idea is to rank the join results progressively during the join operation. We introduce two physical query operators based on variants of ripple join that implement the rank-join algorithm. The operators are nonblocking and can be integrated into pipelined execution plans. We also propose an efficient heuristic designed to optimize a top-k join query by choosing the best join order. We address several practical issues and optimization heuristics to integrate the new join operators in practical query processors. We implement the new operators inside a prototype database engine based on PREDATOR. The experimental evaluation of our approach compares recent algorithms for joining ranked inputs and shows superior performance.
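The idea of ranking join results progressively can be sketched as a threshold-based join over two score-sorted inputs. The Python below is a minimal sketch loosely modeled on that idea, not the paper's operators: the function name `rank_join`, the equi-join on a key, the sum as scoring function, and the strictly alternating reads are all assumptions made for this example.

```python
import heapq
from itertools import count


def rank_join(left, right, k):
    """Return up to k best equi-join results of two score-sorted inputs.

    left, right: lists of (join_key, score) sorted by descending score.
    The combined score is the sum of the two scores (an assumption).
    """
    inputs = (left, right)
    seen = ([], [])        # tuples read so far from each input
    pos = [0, 0]           # read position in each input
    heap = []              # buffered join results, best score first
    tie = count()          # tie-breaker so heap entries never compare keys
    results = []
    side = 0

    def bound(s):
        # Best score any join result using an unread tuple of input s can reach.
        if pos[s] >= len(inputs[s]):
            return float("-inf")            # nothing left to read on this side
        last = seen[s][-1][1] if seen[s] else float("inf")
        other = seen[1 - s]
        top_other = other[0][1] if other else float("inf")
        return last + top_other

    while len(results) < k:
        if pos[side] >= len(inputs[side]):
            side = 1 - side                 # skip an exhausted input
        if pos[side] >= len(inputs[side]):
            break                           # both inputs exhausted
        key, score = inputs[side][pos[side]]
        pos[side] += 1
        seen[side].append((key, score))
        # Join the new tuple with everything already read from the other input.
        for other_key, other_score in seen[1 - side]:
            if other_key == key:
                heapq.heappush(heap, (-(score + other_score), next(tie), key))
        # Report buffered results that no unseen result can beat.
        threshold = max(bound(0), bound(1))
        while heap and len(results) < k and -heap[0][0] >= threshold:
            neg_total, _, joined_key = heapq.heappop(heap)
            results.append((joined_key, -neg_total))
        side = 1 - side                     # alternate between the inputs
    return results


# Hypothetical inputs with integer scores, each sorted by descending score.
left = [("a", 9), ("b", 8), ("c", 3)]
right = [("b", 7), ("a", 6), ("d", 5)]
top2 = rank_join(left, right, k=2)
```

In this example both results are reported after reading four of the six input tuples, which is the nonblocking behavior the abstract refers to.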
IEEE Transactions on Computers | 2002
Sunil Prabhakar; Yuni Xia; Dmitri V. Kalashnikov; Walid G. Aref; Susanne E. Hambrusch
Moving object environments are characterized by large numbers of moving objects and numerous concurrent continuous queries over these objects. Efficient evaluation of these queries in response to the movement of the objects is critical for supporting acceptable response times. In such environments, the traditional approach of building an index on the objects (data) suffers from the need for frequent updates and thereby results in poor performance. In fact, a brute force, no-index strategy yields better performance in many cases. Neither the traditional approach nor the brute force strategy achieves reasonable query processing times. This paper develops novel techniques for the efficient and scalable evaluation of multiple continuous queries on moving objects. Our solution leverages two complementary techniques: Query Indexing and Velocity Constrained Indexing (VCI). Query Indexing relies on 1) incremental evaluation, 2) reversing the role of queries and data, and 3) exploiting the relative locations of objects and queries. VCI takes advantage of the maximum possible speed of objects in order to delay the expensive operation of updating an index to reflect the movement of objects. In contrast to an earlier technique that requires exact knowledge about the movement of the objects, VCI does not rely on such information. While Query Indexing outperforms VCI, it does not efficiently handle the arrival of new queries. Velocity constrained indexing, on the other hand, is unaffected by changes in queries. We demonstrate that a combination of Query Indexing and Velocity Constrained Indexing enables the scalable execution of insertion and deletion of queries in addition to processing ongoing queries. We also develop several optimizations and present a detailed experimental evaluation of our techniques. The experimental results show that the proposed schemes outperform the traditional approaches by almost two orders of magnitude.
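The way a maximum speed lets an index go stale without losing answers can be sketched as follows. This Python example is an assumption-laden illustration rather than the paper's structure: a plain dictionary stands in for the index, and the names `expand`, `contains`, and `vci_range_query` are hypothetical. The query rectangle is enlarged by the farthest distance any object could have traveled since the index was built, and the resulting candidates are then checked against their current positions.

```python
def expand(rect, v_max, elapsed):
    """Grow a rectangle by the farthest any object could have moved."""
    x1, y1, x2, y2 = rect
    d = v_max * elapsed
    return (x1 - d, y1 - d, x2 + d, y2 + d)


def contains(rect, point):
    x1, y1, x2, y2 = rect
    x, y = point
    return x1 <= x <= x2 and y1 <= y <= y2


def vci_range_query(indexed, current, rect, v_max, elapsed):
    """Answer a range query using positions recorded `elapsed` time units ago.

    indexed: {object_id: position when the index was built}
    current: {object_id: position now}, consulted only for candidates
    """
    grown = expand(rect, v_max, elapsed)
    candidates = [oid for oid, p in indexed.items() if contains(grown, p)]
    return sorted(oid for oid in candidates if contains(rect, current[oid]))


# Hypothetical data: no object moves faster than 1 unit per time step.
indexed = {"o1": (0, 0), "o2": (5, 5), "o3": (20, 20)}
current = {"o1": (2, 1), "o2": (6, 7), "o3": (19, 21)}
answer = vci_range_query(indexed, current, rect=(1, 0, 5, 6), v_max=1, elapsed=3)
```

Because every object has moved at most `v_max * elapsed` along each axis, any object now inside the query rectangle was inside the enlarged rectangle at index time, so the stale index produces no false negatives.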
international conference on data engineering | 2005
Xiaopeng Xiong; Mohamed F. Mokbel; Walid G. Aref
Location-aware environments are characterized by a large number of objects and a large number of continuous queries. Both the objects and continuous queries may change their locations over time. In this paper, we focus on continuous k-nearest neighbor queries (CKNN, for short). We present a new algorithm, termed SEA-CNN, for answering continuously a collection of concurrent CKNN queries. SEA-CNN has two important features: incremental evaluation and shared execution. SEA-CNN achieves both efficiency and scalability in the presence of a set of concurrent queries. Furthermore, SEA-CNN does not make any assumptions about the movement of objects, e.g., the objects' velocities and shapes of trajectories, or about the mutability of the objects and/or the queries, i.e., moving or stationary queries issued on moving or stationary objects. We provide theoretical analysis of SEA-CNN with respect to the execution costs, memory requirements and effects of tunable parameters. Comprehensive experimentation shows that SEA-CNN is highly scalable and is more efficient in terms of both I/O and CPU costs in comparison to other R-tree-based CKNN techniques.
IEEE Computer | 2001
James B. D. Joshi; Arif Ghafoor; Walid G. Aref; Eugene H. Spafford
The authors propose an approach that provides a theoretical foundation for the use of object-oriented databases and object-relational databases in data warehouse, multidimensional database, and online analytical processing applications. This approach introduces a set of minimal constraints and extensions to the Unified Modeling Language for representing multidimensional modeling properties for these applications. Multidimensional modeling offers two benefits. First, the model closely parallels how data analyzers think and, therefore, helps users understand data. Second, multidimensional modeling helps predict what final users want to do, thereby facilitating performance improvements. The authors are using their approach to create an automatic implementation of a multidimensional model. They plan to integrate commercial online-analytical-processing tool facilities within their GOLD model case tool as well, a task that involves data warehouse prototyping and sample data generation issues.
Most developers agree that data warehouse, multidimensional database (MDB), and online analytical processing (OLAP) applications emphasize multidimensional modeling, which offers two benefits. First, the multidimensional model closely parallels how data analyzers think and, therefore, helps users understand data. Second, this approach helps predict what final users want to do, thereby facilitating performance improvements. Developers have proposed various approaches for the conceptual design of multidimensional systems. These proposals try to represent the main multidimensional properties at the conceptual level with special emphasis on data structures. A conceptual modeling approach for data warehouses, however, should also address other relevant aspects such as initial user requirements, system behavior, available data sources, and specific issues related to automatic generation of the database schemes.
We believe that object orientation with the Unified Modeling Language can provide an adequate notation for modeling every aspect of a data warehouse system from user requirements to implementation. We propose an OO approach to accomplish the conceptual modeling of data warehouses, MDB, and OLAP applications. This approach introduces a set of minimal constraints and extensions to UML [1] for representing multidimensional modeling properties for these applications. We base these extensions on the standard mechanisms that UML provides for adapting itself to a specific method or model, such as constraints and tagged values. Our work builds on previous research [2-4], which provided a foundation for the results we report here and for earlier versions of our work. We believe that our innovative approach provides a theoretical foundation for the use of OO databases and object-relational databases in data warehouses, MDB, and OLAP applications. We use UML to design data warehouses because it considers an information system's structural and dynamic properties at the conceptual level more naturally than do classic approaches such as the Entity-Relationship model. Further, UML provides powerful mechanisms, such as the Object Constraint Language [1] and the Object Query Language [1], for embedding data warehouse constraints and initial user requirements in the conceptual model. This approach to modeling a data warehouse system yields simple yet powerful extended UML class diagrams that represent main data warehouse properties at the conceptual level. Multidimensional modeling structures information into facts and dimensions. We define a fact as an item of interest for an enterprise, and describe it through a set of attributes called measures or fact attributes (atomic or derived), which are contained in cells or points within the data cube. We base …
Communications of The ACM | 2001
James B. D. Joshi; Walid G. Aref; Arif Ghafoor; Eugene H. Spafford
Using traditional and emerging access control approaches to develop secure applications for the Web.
very large data bases | 2003
Moustafa A. Hammad; Michael J. Franklin; Walid G. Aref; Ahmed K. Elmagarmid
Continuous Query (CQ) systems typically exploit commonality among query expressions to achieve improved efficiency through shared processing. Recently proposed CQ systems have introduced window specifications in order to support unbounded data streams. There has been, however, little investigation of sharing for windowed query operators. In this paper, we address the shared execution of windowed joins, a core operator for CQ systems. We show that the strategy used in systems to date has a previously unreported performance flaw that can negatively impact queries with relatively small windows. We then propose two new execution strategies for shared joins. We evaluate the alternatives using both analytical models and implementation in a DBMS. The results show that one strategy, called MQT, provides the best performance over a range of workload settings.
IEEE Transactions on Multimedia | 2004
Jianping Fan; Ahmed K. Elmagarmid; Xingquan Zhu; Walid G. Aref; Lide Wu
Recent advances in digital video compression and networks have made video more accessible than ever. However, the existing content-based video retrieval systems still suffer from the following problems: 1) the semantics-sensitive video classification problem, caused by the semantic gap between low-level visual features and high-level semantic visual concepts; and 2) the integrated video access problem, caused by the lack of efficient video database indexing, automatic video annotation, and concept-oriented summary organization techniques. In this paper, we propose a novel framework, called ClassView, to make some advances toward more efficient video database indexing and access. 1) A hierarchical semantics-sensitive video classifier is proposed to shorten the semantic gap. The hierarchical tree structure of the semantics-sensitive video classifier is derived from the domain-dependent concept hierarchy of video contents in a database. Relevance analysis is used for selecting the discriminating visual features with suitable importance. The Expectation-Maximization (EM) algorithm is also used to determine the classification rule for each visual concept node in the classifier. 2) A hierarchical video database indexing and summary presentation technique is proposed to support more effective video access over a large-scale database. The hierarchical tree structure of our video database indexing scheme is determined by the domain-dependent concept hierarchy which is also used for video classification. The presentation of visual summary is also integrated with the inherent hierarchical video database indexing tree structure. Integrating video access with an efficient database indexing tree structure provides a great opportunity for supporting more powerful video search engines.
ACM Transactions on Database Systems | 2009
Chi-Yin Chow; Mohamed F. Mokbel; Walid G. Aref
In this article, we present a new privacy-aware query processing framework, Casper*, in which mobile and stationary users can obtain snapshot and/or continuous location-based services without revealing their private location information. In particular, we propose a privacy-aware query processor embedded inside a location-based database server to deal with snapshot and continuous queries based on the knowledge of the user's cloaked location rather than the exact location. Our proposed privacy-aware query processor is completely independent of how we compute the user's cloaked location. In other words, any existing location anonymization algorithms that blur the user's private location into cloaked rectilinear areas can be employed to protect the user's location privacy. We first propose a privacy-aware query processor that not only supports three new privacy-aware query types, but also achieves a trade-off between query processing cost and answer optimality. Then, to improve system scalability of processing continuous privacy-aware queries, we propose a shared execution paradigm that shares query processing among a large number of continuous queries. The proposed scalable paradigm can be tuned through two parameters to trade off between system scalability and answer optimality. Experimental results show that our query processor achieves high quality snapshot and continuous location-based services while supporting queries and/or data with cloaked locations.
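One way to picture answering a query from a cloaked location is a range query that returns every object the user could possibly be near. The Python below is a sketch under stated assumptions, not the framework's query processor: the names `min_dist_to_rect`, `candidate_answer`, and `refine` are hypothetical, the cloaked area is an axis-aligned rectangle, and the client-side refinement step is an assumption made for this example.

```python
import math


def min_dist_to_rect(point, rect):
    """Smallest distance from a point to an axis-aligned rectangle."""
    x, y = point
    x1, y1, x2, y2 = rect
    dx = max(x1 - x, 0, x - x2)
    dy = max(y1 - y, 0, y - y2)
    return math.hypot(dx, dy)


def candidate_answer(objects, cloaked, radius):
    """Server side: objects within `radius` of some point of the cloaked area."""
    return sorted(oid for oid, p in objects.items()
                  if min_dist_to_rect(p, cloaked) <= radius)


def refine(objects, candidates, exact, radius):
    """Client side: keep the candidates within `radius` of the exact location."""
    return sorted(oid for oid in candidates
                  if math.dist(objects[oid], exact) <= radius)


# Hypothetical public objects and a user hidden somewhere in a 4x4 area.
objects = {"gas1": (1, 1), "gas2": (6, 2), "gas3": (10, 10)}
cloaked = (0, 0, 4, 4)
candidates = candidate_answer(objects, cloaked, radius=2)
exact_answer = refine(objects, candidates, exact=(4, 2), radius=2)
```

The server never sees the exact location, and the candidate list always contains the exact answer; a larger cloaked area gives more privacy at the price of a longer candidate list.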