Osmar R. Zaïane | Researchain

Publication

Featured research published by Osmar R. Zaïane.

Proceedings IEEE International Forum on Research and Technology Advances in Digital Libraries -ADL'98- | 1998

Discovering Web access patterns and trends by applying OLAP and data mining technology on Web logs

Osmar R. Zaïane; Man Xin; Jiawei Han

As a confluence of data mining and World Wide Web technologies, it is now possible to perform data mining on Web log records collected from the Internet Web-page access history. The behaviour of Web page readers is imprinted in the Web server log files. Analyzing and exploring regularities in this behaviour can improve the system performance, enhance the quality and delivery of Internet information services to the end user, and identify populations of potential customers for electronic commerce. Thus, by observing people using collections of data, data mining can bring a considerable contribution to digital library designers. In a joint effort between the TeleLearning-NCE (Networks of Centres of Excellence) project on the Virtual University and the NCE-IRIS project on data mining, we have been developing a knowledge discovery tool, called WebLogMiner, for mining Web server log files. This paper presents the design of WebLogMiner, reports current progress and outlines future work in this direction.

Archive | 2001

Web Usage Mining for a Better Web-Based Learning Environment

Osmar R. Zaïane

Web-based technology is often the technology of choice for distance education given the ease of use of the tools to browse the resources on the Web, the relative affordability of accessing the ubiquitous Web, and the simplicity of deploying and maintaining resources on the World Wide Web. Many sophisticated web-based learning environments have been developed and are in use around the world. The same technology is being used for electronic commerce and has become extremely popular. However, while there are clever tools developed to understand online customers’ behaviour in order to increase sales and profit, very little has been done to automatically discover access patterns to understand learners’ behaviour in web-based distance learning. Educators using on-line learning environments and tools have very little support to evaluate learners’ activities and discriminate between different learners’ on-line behaviours. In this paper, we discuss some data mining and machine learning techniques that could be used to enhance web-based learning environments for the educator to better evaluate the learning process, as well as for the learners to help them in their learning endeavour.

International Conference on Data Mining | 2002

Text document categorization by term association

Maria-Luiza Antonie; Osmar R. Zaïane

A good text classifier is a classifier that efficiently categorizes large sets of text documents in a reasonable time frame and with an acceptable accuracy, and that provides classification rules that are human readable for possible fine-tuning. If the training of the classifier is also quick, this could become in some application domains a good asset for the classifier. Many techniques and algorithms for automatic text categorization have been devised. According to published literature, some are more accurate than others, and some provide more interpretable classification models than others. However, none can combine all the beneficial properties enumerated above. In this paper we present a novel approach for automatic text categorization that borrows from market basket analysis techniques using association rule mining in the data-mining field. We focus on two major problems: (1) finding the best term association rules in a textual database by generating and pruning; and (2) using the rules to build a text classifier. Our text categorization method proves to be efficient and effective, and experiments on well-known collections show that the classifier performs well. In addition, training as well as classification are both fast and the generated rules are human readable.
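The two problems named in this abstract (mining term association rules, then classifying with them) can be illustrated with a minimal sketch. It is not the authors' algorithm: it enumerates term sets by brute force instead of generating and pruning candidates, it measures support over the whole collection, and it scores a document by summing the confidences of the matching rules. The names `mine_rules` and `classify` and the threshold values are hypothetical.

```python
from collections import Counter
from itertools import combinations


def mine_rules(docs, min_support=0.2, min_conf=0.6, max_len=2):
    """Mine rules of the form term set => category.

    docs is a list of (set_of_terms, category) pairs.
    Returns a dict mapping (term_tuple, category) to rule confidence.
    Assumption: brute-force enumeration of term sets up to max_len terms.
    """
    joint = Counter()  # (term set, category) -> number of documents
    total = Counter()  # term set -> number of documents in any category
    for terms, category in docs:
        for k in range(1, max_len + 1):
            for combo in combinations(sorted(terms), k):
                joint[(combo, category)] += 1
                total[combo] += 1
    rules = {}
    for (combo, category), count in joint.items():
        support = count / len(docs)        # fraction of all documents
        confidence = count / total[combo]  # P(category | term set)
        if support >= min_support and confidence >= min_conf:
            rules[(combo, category)] = confidence
    return rules


def classify(rules, terms):
    """Return the category with the highest summed confidence, or None."""
    scores = Counter()
    for (combo, category), confidence in rules.items():
        if set(combo) <= set(terms):
            scores[category] += confidence
    return scores.most_common(1)[0][0] if scores else None


training = [
    ({"goal", "match"}, "sport"),
    ({"goal", "team"}, "sport"),
    ({"stock", "market"}, "finance"),
    ({"stock", "bank"}, "finance"),
]
rules = mine_rules(training)
label = classify(rules, {"goal", "fans"})
```

Because each rule is a readable pair such as `(("goal",), "sport")`, the mined model can be inspected and edited by hand, which is the property the abstract emphasizes.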

International Database Engineering and Applications Symposium | 2003

Incremental mining of frequent patterns without candidate generation or support constraint

William Cheung; Osmar R. Zaïane

In this paper, we propose a novel data structure called CATS Tree. CATS Tree extends the idea of FP-Tree to improve storage compression and allow frequent pattern mining without generation of candidate itemsets. The proposed algorithms enable frequent pattern mining with different supports without rebuilding the tree structure. Furthermore, the algorithms allow mining with a single pass over the database as well as efficient insertion or deletion of transactions at any time.

IEEE Transactions on Knowledge and Data Engineering | 2009

Clustering and Sequential Pattern Mining of Online Collaborative Learning Data

Dilhan Perera; Judy Kay; Irena Koprinska; Kalina Yacef; Osmar R. Zaïane

Group work is widespread in education. The growing use of online tools supporting group work generates huge amounts of data. We aim to exploit this data to support mirroring: presenting useful high-level views of information about the group, together with desired patterns characterizing the behavior of strong groups. The goal is to enable the groups and their facilitators to see relevant aspects of the group's operation, provide feedback on whether these are more likely to be associated with positive or negative outcomes, and indicate where the problems are. We explore how useful mirror information can be extracted via a theory-driven approach and a range of clustering and sequential pattern mining techniques. The context is a senior software development project where students use the collaboration tool TRAC. We extract patterns distinguishing the better from the weaker groups and gain insight into the success factors. The results point to the importance of leadership and group interaction, and give promising indications of whether they are occurring. Patterns indicating good individual practices were also identified. We found that some key measures can be mined from early data. The results are promising for advising groups at the start and for early identification of effective and poor practices, in time for remediation.

International Conference on Data Mining | 2003

Protecting sensitive knowledge by data sanitization

Stanley Robson de Medeiros Oliveira; Osmar R. Zaïane

We address the problem of protecting some sensitive knowledge in transactional databases. The challenge is on protecting actionable knowledge for strategic decisions, but at the same time not losing the great benefit of association rule mining. To accomplish that, we introduce a new, efficient one-scan algorithm that meets privacy protection and accuracy in association rule mining, without putting at risk the effectiveness of the data mining per se.
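As a rough illustration of what data sanitization means here, the sketch below hides one sensitive itemset by deleting a single item from enough supporting transactions to push its support down to a chosen ceiling. It is a simplification under stated assumptions and not the one-scan algorithm the abstract introduces: the function names, the choice of victim item, and the `max_support` parameter are all hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t)) / len(transactions)


def sanitize(transactions, sensitive, max_support):
    """Return a copy of the database in which the sensitive itemset has
    support of at most max_support.

    Assumption: the victim item is simply the smallest item of the
    sensitive itemset; a real method would choose it so as to limit
    side effects on non-sensitive rules.
    """
    sensitive = set(sensitive)
    result = [set(t) for t in transactions]  # leave the input untouched
    supporting = [t for t in result if sensitive <= t]
    allowed = int(max_support * len(result))  # supporters that may remain
    victim = min(sensitive)
    for t in supporting[allowed:]:
        t.discard(victim)  # this transaction no longer supports the itemset
    return result


database = [{"a", "b"}, {"a", "b", "c"}, {"a", "b"}, {"c"}]
released = sanitize(database, {"a", "b"}, max_support=0.25)
```

Only the victim item is removed, so itemsets that do not contain it keep their original support. That is the trade-off the abstract describes: sensitive knowledge is hidden while most of the mining value is kept.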

European Conference on Principles of Data Mining and Knowledge Discovery | 2004

Mining positive and negative association rules: an approach for confined rules

Maria-Luiza Antonie; Osmar R. Zaïane

Typical association rules consider only items enumerated in transactions. Such rules are referred to as positive association rules. Negative association rules also consider the same items, but in addition consider negated items (i.e. items absent from transactions). Negative association rules are useful in market-basket analysis to identify products that conflict with each other or products that complement each other. They are also very convenient for associative classifiers, classifiers that build their classification model based on association rules. Many other applications would benefit from negative association rules if it were not for the expensive process of discovering them. Indeed, mining for such rules necessitates the examination of an exponentially large search space. Despite their usefulness, and although they have been referred to in many publications, very few algorithms to mine them have been proposed to date. In this paper we propose an algorithm that extends the support-confidence framework with a sliding correlation coefficient threshold. In addition to finding confident positive rules that have a strong correlation, the algorithm discovers negative association rules with strong negative correlation between the antecedents and consequents.
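The role of the correlation coefficient can be illustrated with a short sketch. It assumes the coefficient is the phi coefficient computed from the presence or absence of two itemsets, and it uses a fixed threshold instead of the sliding one the abstract mentions. The confidence check is omitted, and the names `phi_coefficient`, `rule_type` and `min_corr` are hypothetical.

```python
from math import sqrt


def phi_coefficient(transactions, x, y):
    """Phi correlation between the presence of itemset x and itemset y."""
    x, y = set(x), set(y)
    sets = [set(t) for t in transactions]
    n = len(sets)
    n_x = sum(1 for t in sets if x <= t)         # transactions containing x
    n_y = sum(1 for t in sets if y <= t)         # transactions containing y
    n_xy = sum(1 for t in sets if (x | y) <= t)  # transactions containing both
    denom = sqrt(n_x * (n - n_x) * n_y * (n - n_y))
    if denom == 0:
        return 0.0  # x or y is always or never present: undefined, treat as 0
    return (n * n_xy - n_x * n_y) / denom


def rule_type(transactions, x, y, min_corr=0.5):
    """Label the candidate rule between x and y by correlation strength."""
    rho = phi_coefficient(transactions, x, y)
    if rho >= min_corr:
        return "positive"  # candidate rule x => y
    if rho <= -min_corr:
        return "negative"  # candidate rule x => not y (or not x => y)
    return "none"


baskets = [{"a", "b"}, {"a", "b"}, {"c"}, {"c"}]
kind_ab = rule_type(baskets, {"a"}, {"b"})
kind_ac = rule_type(baskets, {"a"}, {"c"})
```

Filtering on correlation strength before generating negated rules is one way to avoid enumerating the exponentially large space of negated itemsets that the abstract warns about.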

International Conference on Data Mining | 2001

Fast parallel association rule mining without candidacy generation

Osmar R. Zaïane; Mohammad El-Hajj; Paul Lu

In this paper we introduce a new parallel algorithm MLFPT (multiple local frequent pattern tree) for parallel mining of frequent patterns, based on FP-growth mining, that uses only two full I/O scans of the database, eliminating the need for generating candidate items, and distributing the work fairly among processors. We have devised partitioning strategies at different stages of the mining process to achieve near optimal balancing between processors. We have successfully tested our algorithm on datasets larger than 50 million transactions.

International Conference on Management of Data | 2004

An associative classifier based on positive and negative rules

Maria-Luiza Antonie; Osmar R. Zaïane

Associative classifiers use association rules to associate attribute values with observed class labels. This model has been recently introduced in the literature and shows good promise. The proposals so far have concentrated only on, and differ only in, the way rules are ranked and selected in the model. We propose a new framework that uses different types of association rules, positive and negative. Negative association rules of interest are rules that either associate negations of attribute values to classes or negatively associate attribute values to classes. In this paper we propose a new algorithm to discover positive and negative association rules at the same time. We introduce a new associative classifier that takes advantage of these two types of rules. Moreover, we present a new way to prune irrelevant classification rules using a correlation coefficient without jeopardizing the accuracy of our associative classifier model. Our preliminary results with UCI datasets are very encouraging.

International Database Engineering and Applications Symposium | 2003

Algorithms for balancing privacy and knowledge discovery in association rule mining

Stanley Robson de Medeiros Oliveira; Osmar R. Zaïane

The discovery of association rules from large databases has proven beneficial for companies since such rules can be very effective in revealing actionable knowledge that leads to strategic decisions. In tandem with this benefit, association rule mining can also pose a threat to privacy protection. The main problem is that from non-sensitive information or unclassified data, one is able to infer sensitive information, including personal information, facts, or even patterns that are not supposed to be disclosed. This scenario reveals a pressing need for techniques that ensure privacy protection, while facilitating proper information accuracy and mining. In this paper, we introduce new algorithms for balancing privacy and knowledge discovery in association rule mining. We show that our algorithms require only two scans, regardless of the database size and the number of restrictive association rules that must be protected. Our performance study compares the effectiveness and scalability of the proposed algorithms and analyzes the fraction of association rules that are preserved after sanitizing a database. We also report the main results of our performance evaluation and discuss some open research issues.
