Reda Alhajj
University of Calgary
Publication
Featured research published by Reda Alhajj.
Fuzzy Sets and Systems | 2005
Mehmet Kaya; Reda Alhajj
It is not an easy task to know a priori the most appropriate fuzzy sets that cover the domains of quantitative attributes for fuzzy association rules mining, simply because the characteristics of quantitative data are in general unknown. Besides, it is unrealistic to expect that the most appropriate fuzzy sets can always be provided by domain experts. Motivated by this, in this paper we propose an automated method for mining fuzzy association rules. For this purpose, we first present a genetic algorithm (GA) based clustering method that adjusts the centroids of the clusters, which are later handled as midpoints of triangular membership functions. Next, we give a different method for generating the membership functions using the Clustering Using Representatives (CURE) clustering algorithm, which is known as one of the most efficient clustering algorithms described in the literature. Finally, we compare the proposed GA-based approach with other approaches from the literature. Experiments conducted on 100K transactions from the US census in the year 2000 show that the proposed method performs well in terms of execution time and interesting fuzzy association rules.
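The idea of treating cluster centroids as midpoints of triangular membership functions can be illustrated with a minimal sketch. It is not the paper's GA-based procedure: it assumes the centroids are already known, and it assumes (hypothetically) that each triangle's feet sit at the neighbouring centroids or at the domain bounds `lo` and `hi`. The function names are illustrative only.

```python
from typing import Callable, List


def triangular(left: float, peak: float, right: float) -> Callable[[float], float]:
    """Return a triangular membership function with apex at `peak`."""
    def mu(x: float) -> float:
        if x == peak:
            return 1.0
        if left < x < peak:
            return (x - left) / (peak - left)
        if peak < x < right:
            return (right - x) / (right - peak)
        return 0.0
    return mu


def membership_functions(centroids: List[float], lo: float, hi: float) -> List[Callable[[float], float]]:
    """Build one triangle per centroid (assumption: feet at the neighbouring centroids)."""
    cs = sorted(centroids)
    feet = [lo] + cs + [hi]
    return [triangular(feet[i], feet[i + 1], feet[i + 2]) for i in range(len(cs))]


# Hypothetical centroids for an "age"-like attribute on the domain [0, 100].
low, medium, high = membership_functions([20.0, 40.0, 60.0], lo=0.0, hi=100.0)
print(low(30.0), medium(30.0), high(30.0))
```

With this layout, a value halfway between two centroids belongs to both adjacent fuzzy sets with degree 0.5, which is one common way to make the sets overlap.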
Journal of Network and Computer Applications | 2007
Tansel Özyer; Reda Alhajj; Ken Barker
The purpose of the work described in this paper is to provide an intelligent intrusion detection system (IIDS) that uses two of the most popular data mining tasks, namely classification and association rules mining, together for predicting different behaviors in networked computers. To achieve this, we propose a method based on iterative rule learning using a fuzzy rule-based genetic classifier. Our approach is mainly composed of two phases. First, a large number of candidate rules are generated for each class using fuzzy association rules mining, and they are pre-screened using two rule evaluation criteria in order to reduce the fuzzy rule search space. Candidate rules obtained after pre-screening are used in a genetic fuzzy classifier to generate rules for the classes specified in the IIDS, namely Normal, PRB (probe), DOS (denial of service), U2R (user to root) and R2L (remote to local). During the next stage, a boosting genetic algorithm is employed for each class to find the fuzzy rules required to classify its data; each time a fuzzy rule is extracted, it is included in the system. The boosting mechanism evaluates the weight of each data item to help the rule extraction mechanism focus more on data having relatively more weight, i.e., data covered less by the rules extracted up to the current iteration. Each extracted fuzzy rule is assigned a weight. Weighted fuzzy rules in each class are aggregated to find the vote of each class label for each data item.
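The final aggregation step, in which weighted fuzzy rules vote for a class label, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the aggregation formula, so the sketch assumes a vote equal to the sum of rule weight times the rule's match degree. The rules, weights and the `conns` attribute are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

# A rule is (class label, rule weight, function returning a match degree in [0, 1]).
Rule = Tuple[str, float, Callable[[dict], float]]


def classify(item: dict, rules: List[Rule]) -> Tuple[str, Dict[str, float]]:
    """Sum weight * match degree per class and return the winning label and all votes."""
    votes: Dict[str, float] = defaultdict(float)
    for label, weight, match in rules:
        votes[label] += weight * match(item)
    return max(votes, key=votes.get), dict(votes)


# Hypothetical rules over a single attribute, connections per second.
rules: List[Rule] = [
    ("Normal", 0.6, lambda x: 1.0 if x["conns"] < 10 else 0.0),
    ("DOS", 0.9, lambda x: min(1.0, x["conns"] / 100)),
]

label, votes = classify({"conns": 80}, rules)
print(label, votes)
```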
Systems, Man and Cybernetics | 2000
Osman Abul; Faruk Polat; Reda Alhajj
Learning in a partially observable and nonstationary environment is still one of the challenging problems in the area of multiagent (MA) learning. Reinforcement learning is a generic method that suits the needs of MA learning in many aspects. This paper presents two new multiagent-based, domain-independent coordination mechanisms for reinforcement learning; multiple agents do not require explicit communication among themselves to learn coordinated behavior. The first coordination mechanism is the perceptual coordination mechanism, where other agents are included in state descriptions and coordination information is learned from state transitions. The second is the observing coordination mechanism, which also includes other agents in state descriptions and additionally observes the rewards of nearby agents from the environment. The observed rewards and the agent's own reward are used to construct an optimal policy. This way, the latter mechanism tends to increase region-wide joint rewards. The selected experimental domain is the adversarial food-collecting world (AFCW), which can be configured both as a single-agent and as a multiagent environment. Function approximation and generalization techniques are used because of the huge state space. Experimental results show the effectiveness of these mechanisms.
IEEE Transactions on Knowledge and Data Engineering | 2011
Faras Rasheed; Mohammed Alshalalfa; Reda Alhajj
Periodic pattern mining or periodicity detection has a number of applications, such as prediction, forecasting, detection of unusual activities, etc. The problem is not trivial because the data to be analyzed are mostly noisy and different periodicity types (namely symbol, sequence, and segment) are to be investigated. Accordingly, we argue that there is a need for a comprehensive approach that is capable of analyzing the whole time series or a subsection of it, that effectively handles different types of noise (to a certain degree), and that at the same time is able to detect different types of periodic patterns; combining these under one umbrella is by itself a challenge. In this paper, we present an algorithm which can detect symbol, sequence (partial), and segment (full cycle) periodicity in time series. The algorithm uses a suffix tree as the underlying data structure; this allows us to design the algorithm such that its worst-case complexity is O(k·n²), where k is the maximum length of a periodic pattern and n is the length of the analyzed portion (whole or subsection) of the time series. The algorithm is noise resilient; it has been successfully demonstrated to work with replacement, insertion, deletion, or a mixture of these types of noise. We have tested the proposed algorithm on both synthetic and real data from different domains, including protein sequences. The conducted comparative study demonstrates the applicability and effectiveness of the proposed algorithm; it is generally more time-efficient and noise-resilient than existing algorithms.
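To make the notion of symbol periodicity concrete, here is a brute-force sketch. It is deliberately not the suffix-tree algorithm of the paper: it simply checks, for each candidate period and starting position, how often a symbol recurs at the expected positions. The confidence measure (hits divided by expected occurrences) and the function names are assumptions made for illustration.

```python
from typing import List, Tuple


def symbol_periodicity(series: str, symbol: str, period: int, start: int) -> float:
    """Fraction of positions start, start+period, ... that hold `symbol`."""
    positions = range(start, len(series), period)
    if len(positions) == 0:
        return 0.0
    hits = sum(1 for i in positions if series[i] == symbol)
    return hits / len(positions)


def find_symbol_periods(series: str, max_period: int, min_conf: float) -> List[Tuple[str, int, int, float]]:
    """Enumerate (symbol, period, start, confidence) tuples meeting the threshold."""
    results = []
    for period in range(1, max_period + 1):
        for start in range(period):
            for symbol in set(series[start::period]):
                conf = symbol_periodicity(series, symbol, period, start)
                if conf >= min_conf:
                    results.append((symbol, period, start, conf))
    return results


# The 'd' at index 8 acts as replacement noise in an otherwise periodic series.
series = "abcabcabdabc"
print(symbol_periodicity(series, "c", period=3, start=2))
print(find_symbol_periods(series, max_period=3, min_conf=1.0))
```

Lowering `min_conf` below 1.0 is what lets a detector of this kind tolerate a limited amount of replacement noise; insertion and deletion noise shift positions and are not handled by this naive check.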
Information Systems | 2003
Reda Alhajj
The maintenance of an existing database depends on the depth of understanding of its characteristics. Such an understanding is easily lost when the developers disperse. The situation becomes worse when the related documentation is missing. This paper addresses this issue by extracting the extended entity-relationship schema from the relational schema. We developed algorithms that investigate characteristics of an existing legacy database in order to identify candidate keys of all relations in the relational schema, to locate foreign keys, and to decide on the appropriate links between the given relations. Based on this analysis, a graph consistent with the entity-relationship diagram is derived to contain all possible unary and binary relationships between the given relations. The minimum and maximum cardinalities of each link in the mentioned graph are determined, and extra links within the graph are identified and categorized, if any. The latter information is necessary to optimize foreign-key-related information. Finally, the last steps in the process involve (when applicable) suggesting improvements on the original conceptual design, deciding on relationships with attributes, many-to-many and n-ary (n ≥ 3) relationships, and identifying is-a links. User involvement in the process is minimized to the case of having multiple choices, where the system does not have the semantic knowledge required to decide on a certain choice.
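The first two analysis steps, identifying candidate keys and locating foreign keys, can be sketched from the data alone. This is an assumption-laden illustration rather than the paper's algorithms: it treats any minimal attribute combination whose values are unique in the current rows as a candidate key, and any column whose values all appear in a parent key as a possible foreign key. Uniqueness in a data snapshot only suggests a key; it does not prove one. Table and attribute names are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Sequence, Tuple

Row = Dict[str, object]


def candidate_keys(rows: List[Row], attrs: Sequence[str], max_size: int = 2) -> List[Tuple[str, ...]]:
    """Return minimal attribute combinations whose values are unique across rows."""
    keys: List[Tuple[str, ...]] = []
    for size in range(1, max_size + 1):
        for combo in combinations(attrs, size):
            if any(set(k) <= set(combo) for k in keys):
                continue  # contains a smaller key, so it is not minimal
            values = [tuple(r[a] for a in combo) for r in rows]
            if len(set(values)) == len(values):
                keys.append(combo)
    return keys


def is_foreign_key(child: List[Row], child_attr: str, parent: List[Row], parent_key: str) -> bool:
    """True if every non-null child value also occurs in the parent key column."""
    parent_values = {r[parent_key] for r in parent}
    return all(r[child_attr] in parent_values for r in child if r[child_attr] is not None)


dept = [{"id": 1, "name": "A"}, {"id": 2, "name": "B"}]
emp = [
    {"eno": 1, "dept": 1, "name": "x"},
    {"eno": 2, "dept": 1, "name": "y"},
    {"eno": 3, "dept": 2, "name": "x"},
]
print(candidate_keys(emp, ["eno", "dept", "name"]))
print(is_foreign_key(emp, "dept", dept, "id"))
```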
IEEE International Conference on Fuzzy Systems | 2003
Mehmet Kaya; Reda Alhajj
In this paper, we propose a genetic algorithm (GA) based clustering method, which dynamically adjusts the fuzzy sets to provide maximum profit within an interval of user-specified minimum support values. This is achieved by tuning the base values of the membership functions for each quantitative attribute so as to maximize the sum of large itemsets in a certain interval of minimum support values. To the best of our knowledge, this is the first effort in this direction. To support our claim, we compare the proposed GA-based approach with a CURE-based approach. Experimental results on synthetic transactions show that the proposed clustering method performs well compared with the CURE-based approach in terms of the number of produced large itemsets and interesting association rules.
ACM Symposium on Applied Computing | 2006
Abhishek Gaurav; Reda Alhajj
In this paper, we present an approach for incorporating fuzzy and imprecise data in XML documents. We describe the ways to introduce fuzziness using both possibility theory and similarity relations. We then show how to map the data from a fuzzy relational database into a fuzzy XML document, with the corresponding XML schema. This approach will aid in liberating the data stored in fuzzy relational databases onto the web as fuzzy XML documents.
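A mapping from a fuzzy relational tuple to a fuzzy XML fragment might look like the sketch below. The abstract does not specify the XML vocabulary, so the element and attribute names `Dist`, `Val` and `Poss`, as well as the `employee` table, are hypothetical. The sketch covers only possibility distributions, not similarity relations or schema generation.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Union

# A fuzzy value is a possibility distribution {value: degree}; crisp values are plain scalars.
FuzzyValue = Union[str, int, float, Dict[str, float]]


def fuzzy_row_to_xml(table: str, row: Dict[str, FuzzyValue]) -> ET.Element:
    """Map one tuple to an element; fuzzy attributes become a <Dist> of <Val Poss=...> children."""
    elem = ET.Element(table)
    for attr, value in row.items():
        child = ET.SubElement(elem, attr)
        if isinstance(value, dict):
            dist = ET.SubElement(child, "Dist")
            for v, degree in value.items():
                val = ET.SubElement(dist, "Val", Poss=str(degree))
                val.text = str(v)
        else:
            child.text = str(value)
    return elem


row = {"name": "Ann", "age": {"young": 0.8, "middle": 0.3}}
print(ET.tostring(fuzzy_row_to_xml("employee", row), encoding="unicode"))
```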
Applied Intelligence | 2006
Mehmet Kaya; Reda Alhajj
It is not an easy task to know a priori the most appropriate fuzzy sets that cover the domains of quantitative attributes for fuzzy association rules mining. In general, it is unrealistic to expect that experts can always provide such sets. Finding the most appropriate fuzzy sets becomes a more complex problem when items are not considered to have equal importance and the support and confidence parameters required for the association rules mining process are specified as linguistic terms. Existing clustering-based automated methods are not satisfactory because they do not consider the optimization of the discovered membership functions. In order to tackle this problem, we propose a genetic algorithm (GA) based clustering method, which dynamically adjusts the fuzzy sets to provide maximum profit based on user-specified linguistic minimum support and confidence terms. This is achieved by tuning the base values of the membership functions for each quantitative attribute with respect to two different evaluation functions that maximize the number of large itemsets and the average of the confidence intervals of the generated rules. To the best of our knowledge, this is the first effort in this direction. Experiments conducted on 100K transactions from the adult database of the United States census in the year 2000 demonstrate that the proposed clustering method exhibits good performance in terms of the number of produced large itemsets and interesting association rules.
Information Reuse and Integration | 2004
Kevin Miller; Chris Gee; Ryan Inaba; Tansel Özyer; Anthony Chiu Wa Lo; Reda Alhajj
In today's fast-paced, technology-based world, people need access to information on the go, anywhere and at any time. Because of this, mobile databases are greatly increasing in popularity. Using mobile devices, users can access and modify information without a network connection, and can then synchronize any updates when they get connected. This synchronization process must not only be able to perform updates on data from multiple users, but must also be able to detect and resolve conflicts. This process can sometimes be expensive, so the synchronization algorithm must be not only robust but also efficient. Motivated by this, we present DeferredSync as a new efficient XML-based synchronization technique. DeferredSync converts relational data into an XML tree structure and then makes use of deferred views in order to minimize bandwidth and storage space for the client. Finally, a conflict detection and resolution algorithm using priority selection to synchronize the data is outlined.
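Conflict detection with priority-based resolution can be sketched in a few lines. This is not DeferredSync itself, whose details the abstract only outlines: the sketch assumes a conflict means two users submitted different values for the same key, and that the user with the highest numeric priority wins. The update format, user names and priorities are hypothetical.

```python
from typing import Dict, List, Tuple

Update = Tuple[str, str, str]  # (user, key, new value)


def resolve(updates: List[Update], priority: Dict[str, int]) -> Tuple[Dict[str, str], List[str]]:
    """Merge offline updates; on conflicting values, keep the highest-priority user's value."""
    by_key: Dict[str, List[Tuple[str, str]]] = {}
    for user, key, value in updates:
        by_key.setdefault(key, []).append((user, value))

    merged: Dict[str, str] = {}
    conflicts: List[str] = []
    for key, changes in by_key.items():
        if len({value for _, value in changes}) > 1:
            conflicts.append(key)  # different values submitted for the same key
        winner = max(changes, key=lambda change: priority[change[0]])
        merged[key] = winner[1]
    return merged, conflicts


updates = [
    ("alice", "phone", "111"),
    ("bob", "phone", "222"),
    ("bob", "city", "Calgary"),
]
merged, conflicts = resolve(updates, priority={"alice": 2, "bob": 1})
print(merged, conflicts)
```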
Expert Systems With Applications | 2012
Ela Yildizer; Ali Metin Balci; Mohammad Hassan; Reda Alhajj
Highlights: Effective CBIR for non-texture images. An extremely fast CBIR system which uses a Multiple Support Vector Machines Ensemble. Daubechies wavelet transformation is used for extracting the feature vectors of images.
With the evolution of digital technology, there has been a significant increase in the number of images stored in electronic format. These range from personal collections to medical and scientific images that are currently collected in large databases. Many users and organizations can now acquire large numbers of images, and it has become very important to retrieve relevant multimedia resources and to effectively locate matching images in large databases. In this context, content-based image retrieval (CBIR) systems have become very popular for browsing, searching and retrieving images from a large database of digital images with minimum human intervention. The research community is competing for more efficient and effective methods, as CBIR systems may be heavily employed in serving time-critical applications in scientific and medical domains. This paper proposes an extremely fast CBIR system which uses a Multiple Support Vector Machines Ensemble. We have used Daubechies wavelet transformation for extracting the feature vectors of images. The reported test results are very promising. Using data mining techniques improved not only the efficiency of the CBIR system but also the accuracy of the overall process.