Maria-Luiza Antonie
University of Alberta
Publication
Featured research published by Maria-Luiza Antonie.
international conference on data mining | 2002
Maria-Luiza Antonie; Osmar R. Zaïane
A good text classifier is one that efficiently categorizes large sets of text documents in a reasonable time frame and with acceptable accuracy, and that provides human-readable classification rules for possible fine-tuning. If training the classifier is also quick, this can be a valuable asset in some application domains. Many techniques and algorithms for automatic text categorization have been devised. According to published literature, some are more accurate than others, and some provide more interpretable classification models than others. However, none combines all the beneficial properties enumerated above. In this paper we present a novel approach for automatic text categorization that borrows from market basket analysis techniques using association rule mining in the data-mining field. We focus on two major problems: (1) finding the best term association rules in a textual database by generating and pruning; and (2) using the rules to build a text classifier. Our text categorization method proves to be efficient and effective, and experiments on well-known collections show that the classifier performs well. In addition, both training and classification are fast, and the generated rules are human readable.
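The two problems named in this abstract, mining term association rules and using them to classify, can be illustrated with a minimal sketch. It is not the authors' algorithm: the function names `mine_rules` and `classify`, the thresholds, the per-category support definition, and the confidence-summing vote are all assumptions chosen for illustration.

```python
from collections import defaultdict
from itertools import combinations


def mine_rules(docs, min_support=0.5, min_confidence=0.7, max_len=2):
    """Mine rules of the form termset -> category.

    docs: list of (set_of_terms, category) pairs.
    Support is measured inside each category (an assumption of this sketch);
    confidence is measured over the whole collection.
    Returns a list of (termset, category, confidence) triples.
    """
    by_category = defaultdict(list)
    for terms, category in docs:
        by_category[category].append(frozenset(terms))

    rules = []
    for category, cat_docs in by_category.items():
        # Count every termset of up to max_len terms in this category.
        counts = defaultdict(int)
        for terms in cat_docs:
            for k in range(1, max_len + 1):
                for combo in combinations(sorted(terms), k):
                    counts[frozenset(combo)] += 1

        for termset, n in counts.items():
            if n / len(cat_docs) < min_support:
                continue  # prune infrequent termsets
            covered = sum(1 for terms, _ in docs if termset <= set(terms))
            confidence = n / covered
            if confidence >= min_confidence:
                rules.append((termset, category, confidence))
    return rules


def classify(rules, terms):
    """Pick the category whose matching rules have the largest total confidence."""
    terms = set(terms)
    scores = defaultdict(float)
    for termset, category, confidence in rules:
        if termset <= terms:
            scores[category] += confidence
    return max(scores, key=scores.get) if scores else None


# Hypothetical toy collection, for illustration only.
toy_docs = [
    ({"goal", "match"}, "sport"),
    ({"goal", "team"}, "sport"),
    ({"stock", "market"}, "finance"),
    ({"stock", "bank"}, "finance"),
]
toy_rules = mine_rules(toy_docs)
print(classify(toy_rules, {"goal", "referee"}))
```

Each rule is a plain (termset, category, confidence) triple, which is what makes such a model readable and manually tunable.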
european conference on principles of data mining and knowledge discovery | 2004
Maria-Luiza Antonie; Osmar R. Zaïane
Typical association rules consider only items enumerated in transactions. Such rules are referred to as positive association rules. Negative association rules consider the same items, but in addition consider negated items (i.e., items absent from transactions). Negative association rules are useful in market-basket analysis to identify products that conflict with each other or products that complement each other. They are also very convenient for associative classifiers, which build their classification model from association rules. Many other applications would benefit from negative association rules if it were not for the expensive process of discovering them. Indeed, mining for such rules requires examining an exponentially large search space. Despite their usefulness, and although they have been referred to in many publications, very few algorithms to mine them have been proposed to date. In this paper we propose an algorithm that extends the support-confidence framework with a sliding correlation coefficient threshold. In addition to finding confident positive rules that have a strong correlation, the algorithm discovers negative association rules with strong negative correlation between the antecedents and consequents.
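As a rough illustration of how a correlation coefficient can separate positive from negative associations, the sketch below computes the phi coefficient between two itemsets from their supports. The choice of phi, the helper names, and the fixed threshold `min_corr` are assumptions; the sliding threshold described in the abstract is not implemented here.

```python
from math import sqrt


def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t)) / len(transactions)


def phi(transactions, x, y):
    """Phi correlation coefficient between the presence of x and the presence of y."""
    sx = support(transactions, x)
    sy = support(transactions, y)
    sxy = support(transactions, set(x) | set(y))
    denom = sqrt(sx * (1 - sx) * sy * (1 - sy))
    if denom == 0:
        return 0.0  # x or y occurs in all or in no transactions
    return (sxy - sx * sy) / denom


def rule_type(transactions, x, y, min_corr=0.5):
    """Label the association between x and y using a fixed correlation threshold."""
    r = phi(transactions, x, y)
    if r >= min_corr:
        return "positive"
    if r <= -min_corr:
        return "negative"
    return "none"


# Hypothetical baskets, for illustration only.
baskets = [{"tea", "milk"}, {"tea", "milk"}, {"coffee"}, {"coffee"}]
print(rule_type(baskets, {"tea"}, {"milk"}))
print(rule_type(baskets, {"tea"}, {"coffee"}))
```

A strongly negative value suggests a candidate negative rule such as "x implies not y"; a full miner would still check support and confidence for the negated form.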
international conference on management of data | 2004
Maria-Luiza Antonie; Osmar R. Zaïane
Associative classifiers use association rules to associate attribute values with observed class labels. This model has been recently introduced in the literature and shows good promise. The proposals so far have concentrated only on, and differ only in, the way rules are ranked and selected in the model. We propose a new framework that uses different types of association rules, positive and negative. Negative association rules of interest are rules that either associate negations of attribute values to classes or negatively associate attribute values to classes. In this paper we propose a new algorithm to discover positive and negative association rules at the same time. We introduce a new associative classifier that takes advantage of these two types of rules. Moreover, we present a new way to prune irrelevant classification rules using a correlation coefficient without jeopardizing the accuracy of our associative classifier model. Our preliminary results with UCI datasets are very encouraging.
australasian database conference | 2002
Osmar R. Zaïane; Maria-Luiza Antonie
Automatic text categorization has been an important application and research topic since the inception of digital documents. Today, text categorization is a necessity due to the very large number of text documents that we have to deal with daily. Many techniques and algorithms for automatic text categorization have been devised and proposed in the literature. However, there is still much room for improving the effectiveness of these classifiers, and new models need to be examined. We propose herein a new approach for automatic text categorization. This paper explores the use of association rule mining in building a text categorization system and proposes a new fast algorithm for building a text classifier. Our approach has the advantage of a very fast training phase, and the rules of the generated classifier are easy to understand and manually tuneable. Our investigation leads us to conclude that association rule mining is a good and promising strategy for efficient automatic text categorization.
pacific-asia conference on knowledge discovery and data mining | 2002
Maria-Luiza Antonie; Osmar R. Zaïane; Alexandru Coman
This paper presents two classification systems for medical images based on association rule mining. The system we propose consists of a pre-processing phase, a phase for mining the resulting transactional database, and a final phase to organize the resulting association rules in a classification model. The experimental results show that the method performs well, reaching over 80% in accuracy. Moreover, this paper illustrates how important the data cleaning phase is in building an accurate data mining architecture for image classification.
international conference on knowledge based and intelligent information and engineering systems | 2005
Osmar R. Zaïane; Maria-Luiza Antonie
The integration of supervised classification and association rules for building classification models is not new. One major advantage is that models are human readable and can be edited. However, it is common knowledge that association rule mining typically yields a very large number of rules, defeating the purpose of a human-readable model. Pruning unnecessary rules without jeopardizing the classification accuracy is paramount but very challenging. In this paper we study strategies for classification rule pruning in the case of associative classifiers.
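The abstract does not name the pruning strategies studied. As one commonly used example of the general idea, the sketch below applies database-coverage pruning: rules are ranked, and a rule is kept only if it correctly classifies at least one training instance not yet covered by a higher-ranked rule. The function name, the rule representation, and the ranking order are assumptions of this sketch.

```python
def prune_by_coverage(rules, data):
    """Database-coverage pruning (illustrative sketch).

    rules: list of (frozenset_antecedent, label, confidence) triples.
    data:  list of (set_of_items, label) training instances.
    Returns the subset of rules that each cover at least one new instance.
    """
    # Rank by confidence (high first), then prefer shorter antecedents.
    ranked = sorted(rules, key=lambda r: (-r[2], len(r[0])))
    uncovered = list(range(len(data)))
    kept = []
    for antecedent, label, confidence in ranked:
        correct = [
            i for i in uncovered
            if antecedent <= data[i][0] and data[i][1] == label
        ]
        if correct:
            kept.append((antecedent, label, confidence))
            # Drop every remaining instance the rule matches.
            uncovered = [i for i in uncovered if not antecedent <= data[i][0]]
        if not uncovered:
            break
    return kept


# Hypothetical training data and rules, for illustration only.
train = [({"a", "b"}, "x"), ({"a"}, "x"), ({"c"}, "y")]
candidate_rules = [
    (frozenset({"a"}), "x", 1.0),
    (frozenset({"a", "b"}), "x", 1.0),
    (frozenset({"c"}), "y", 1.0),
]
print(len(prune_by_coverage(candidate_rules, train)))
```

In this toy case the longer rule on {"a", "b"} adds no coverage beyond the rule on {"a"}, so it is dropped while the training instances stay covered.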
international conference on data mining | 2006
Maria-Luiza Antonie; Osmar R. Zaïane; Robert C. Holte
Association rule-based classifiers have recently emerged as competitive classification systems. However, there are still deficiencies that hinder their performance. One deficiency is the use of rules in the classification stage. Current systems assign classes to new objects based on the best rule applied or on some predefined scoring of multiple rules. In this paper we propose a new technique where the system automatically learns how to use the rules. We achieve this by developing a two-stage classification model. First, we use association rule mining to discover classification rules. Second, we employ another learning algorithm to learn how to use these rules in the prediction process. Our two-stage approach outperforms C4.5 and RIPPER on the UCI datasets in our study, and outperforms other rule-learning methods on more than half the datasets. The versatility of our method is also demonstrated by applying it to text classification, where it equals the performance of the best known systems for this task, SVMs.
knowledge discovery and data mining | 2005
Rafal Rak; Wojciech Stach; Osmar R. Zaïane; Maria-Luiza Antonie
Among the many existing classification methods are associative classifiers. This newly suggested model uses association rule mining to generate classification rules associating observed features with class labels. Given the binary nature of association rules, these classification models do not take into account repetition of features when categorizing. In this paper, we enhance the idea of associative classifiers with associations with re-occurring items and show that this mixture produces a good model for classification when repetition of observed features is relevant in the data mining application at hand.
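To make the notion of re-occurring items concrete, the sketch below treats each transaction as a multiset and counts a pattern as matched only when every item occurs at least the required number of times. This is an illustrative reading of the abstract, not the authors' algorithm; the names `contains` and `multiset_support` are assumptions.

```python
from collections import Counter


def contains(doc_counts, pattern):
    """True if each item of pattern occurs in doc_counts at least the required number of times."""
    return all(doc_counts.get(item, 0) >= n for item, n in pattern.items())


def multiset_support(docs, pattern):
    """Fraction of multiset transactions that contain the pattern with its repetitions."""
    return sum(1 for d in docs if contains(d, pattern)) / len(docs)


# Hypothetical documents as bags of words, for illustration only.
bags = [
    Counter("goal goal team".split()),
    Counter("goal team".split()),
    Counter("goal goal goal".split()),
]
print(multiset_support(bags, {"goal": 1}))
print(multiset_support(bags, {"goal": 2}))
```

With binary transactions the two patterns above would be indistinguishable; keeping counts lets a rule require that a feature appears repeatedly.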
Archive | 2001
Osmar R. Zaïane; Maria-Luiza Antonie
Discriminating between text articles and automatically classifying documents is an essential task for many applications. With the prevalence of digital documents and the wide use of e-mail and web documents, text categorization is regaining interest and is becoming a central problem in digital text collections. There have been many approaches to solve this problem, mainly from the machine learning community. This paper proposes a new fast method for building a text classifier using association rule mining by discovering associations between terms and topical categories of documents.
acm multimedia | 2001
Maria-Luiza Antonie; Osmar R. Zaïane; Alexandru Coman