Artur J. Ferreira
Instituto Superior de Engenharia de Lisboa
Publications
Featured research published by Artur J. Ferreira.
Pattern Recognition Letters | 2012
Artur J. Ferreira; Mário A. T. Figueiredo
Feature selection is a central problem in machine learning and pattern recognition. On large datasets (in terms of dimension and/or number of instances), using search-based or wrapper techniques can be computationally prohibitive. Moreover, many filter methods based on relevance/redundancy assessment also take a prohibitively long time on high-dimensional datasets. In this paper, we propose efficient unsupervised and supervised feature selection/ranking filters for high-dimensional datasets. These methods use low-complexity relevance and redundancy criteria, applicable to supervised, semi-supervised, and unsupervised learning, and can act as pre-processors that let computationally intensive methods focus on smaller subsets of promising features. The experimental results, with up to 10^5 features, show the time efficiency of our methods, which attain lower generalization error than state-of-the-art techniques while being dramatically simpler and faster.
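The abstract does not give the paper's exact criteria; as a rough illustration of a low-complexity relevance/redundancy filter, the sketch below ranks features by variance (a cheap unsupervised relevance proxy) and greedily prunes features highly correlated with those already kept. The function names and thresholds are our own, not the paper's.

```python
import numpy as np

def rank_and_prune(X, max_features, redundancy_threshold=0.95):
    """Rank features by variance (an unsupervised relevance proxy), then
    greedily keep a feature only if its absolute correlation with every
    already-kept feature stays below the redundancy threshold."""
    order = np.argsort(X.var(axis=0))[::-1]          # most dispersed first
    kept = []
    for j in order:
        if len(kept) == max_features:
            break
        redundant = any(
            abs(np.corrcoef(X[:, j], X[:, k])[0, 1]) > redundancy_threshold
            for k in kept
        )
        if not redundant:
            kept.append(j)
    return kept

rng = np.random.default_rng(0)
base = rng.normal(size=(100, 1))
X = np.hstack([base,                                     # feature 0
               base + 1e-3 * rng.normal(size=(100, 1)),  # 1: near-duplicate of 0
               rng.normal(size=(100, 3))])               # 2-4: independent noise
kept = rank_and_prune(X, max_features=3)   # one of the duplicates is pruned
```

Both passes are a single sweep over the features, which is what keeps this family of filters cheap enough to act as a pre-processor.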
Pattern Recognition | 2012
Artur J. Ferreira; Mário A. T. Figueiredo
Many learning problems require handling high-dimensional datasets with a relatively small number of instances. Learning algorithms are thus confronted with the curse of dimensionality, and need to address it in order to be effective. Examples of these types of data include the bag-of-words representation in text classification problems and gene expression data for tumor detection/classification. Usually, among the high number of features characterizing the instances, many may be irrelevant (or even detrimental) for the learning tasks. It is thus clear that there is a need for adequate techniques for feature representation, reduction, and selection, to improve the classification accuracy and reduce the memory requirements. In this paper, we propose combined unsupervised feature discretization and feature selection techniques, suitable for medium and high-dimensional datasets. The experimental results on several standard datasets, with both sparse and dense features, show the efficiency of the proposed techniques as well as improvements over previous related techniques.
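To make the combination of unsupervised discretization and selection concrete, here is a hedged sketch: equal-width quantization followed by selection of the highest-entropy discretized features. The paper's actual discretizer and relevance criterion may differ; this is only an illustration of the pipeline shape.

```python
import numpy as np

def equal_width_discretize(X, n_bins=4):
    """Quantize each feature independently into n_bins equal-width bins."""
    mins, maxs = X.min(axis=0), X.max(axis=0)
    widths = np.where(maxs > mins, (maxs - mins) / n_bins, 1.0)
    codes = np.floor((X - mins) / widths).astype(int)
    return np.clip(codes, 0, n_bins - 1)

def select_by_entropy(codes, k):
    """Keep the k discretized features with the highest empirical entropy:
    a feature that collapses into few bins carries little information."""
    def entropy(col):
        _, counts = np.unique(col, return_counts=True)
        p = counts / counts.sum()
        return -(p * np.log2(p)).sum()
    scores = np.array([entropy(codes[:, j]) for j in range(codes.shape[1])])
    return np.argsort(scores)[::-1][:k]

rng = np.random.default_rng(1)
X = np.column_stack([rng.uniform(size=200),   # informative spread
                     np.full(200, 3.0),       # constant: zero entropy
                     rng.uniform(size=200)])
codes = equal_width_discretize(X)
selected = select_by_entropy(codes, k=2)      # the constant feature is dropped
```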
Ensemble Machine Learning: Methods and Applications | 2012
Artur J. Ferreira; Mário A. T. Figueiredo
Boosting is a class of machine learning methods based on the idea that a combination of simple classifiers (obtained by a weak learner) can perform better than any of the simple classifiers alone. A weak learner (WL) is a learning algorithm capable of producing classifiers with probability of error strictly (but only slightly) less than that of random guessing (0.5, in the binary case). On the other hand, a strong learner (SL) is able (given enough training data) to yield classifiers with arbitrarily small error probability.
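The weak/strong learner distinction is realized concretely by algorithms such as AdaBoost. The minimal sketch below uses exhaustively searched decision stumps as the weak learner; it is a generic illustration of boosting, not code from the chapter.

```python
import numpy as np

def train_stump(X, y, w):
    """The weak learner: exhaustive search for the decision stump
    (feature, threshold, polarity) with the lowest weighted error."""
    best = (0, 0.0, 1, np.inf)
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for pol in (1, -1):
                pred = np.where(pol * (X[:, j] - t) >= 0, 1, -1)
                err = w[pred != y].sum()
                if err < best[3]:
                    best = (j, t, pol, err)
    return best

def adaboost(X, y, rounds=20):
    """AdaBoost: re-weight examples so each new stump concentrates on the
    mistakes of the ensemble built so far (labels must be in {-1, +1})."""
    w = np.full(len(y), 1.0 / len(y))
    ensemble = []
    for _ in range(rounds):
        j, t, pol, err = train_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)   # stump weight in the vote
        pred = np.where(pol * (X[:, j] - t) >= 0, 1, -1)
        w *= np.exp(-alpha * y * pred)          # up-weight the mistakes
        w /= w.sum()
        ensemble.append((alpha, j, t, pol))
    return ensemble

def predict(ensemble, X):
    """The strong learner: a weighted vote of all the stumps."""
    score = sum(a * np.where(p * (X[:, j] - t) >= 0, 1, -1)
                for a, j, t, p in ensemble)
    return np.sign(score)

# An interval class no single stump can fit, but a boosted combination can:
X = np.array([[0.1], [0.2], [0.4], [0.5], [0.6], [0.8], [0.9]])
y = np.array([-1, -1, 1, 1, 1, -1, -1])
ensemble = adaboost(X, y)
```

As long as each stump's weighted error stays below 0.5, the ensemble's training error decreases exponentially in the number of rounds, which is the formal sense in which a weak learner yields a strong one.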
International Conference on Image Processing | 2003
Artur J. Ferreira; Mário A. T. Figueiredo
This paper exploits independent component analysis (ICA) to obtain transform-based compression schemes adapted to specific image classes. This adaptation results from the data-dependent nature of the ICA bases, learnt from training images. Several coder architectures are evaluated and compared, according to both standard (SNR) and perceptual (picture quality scale - PQS) criteria, on two classes of images: faces and fingerprints. For fingerprint images, our coders perform close to the well-known special-purpose wavelet-based coder developed by the FBI. For face images, our ICA-based coders clearly outperform JPEG at the low bit-rates herein considered.
Signal Processing-image Communication | 2006
Artur J. Ferreira; Mário A. T. Figueiredo
This paper addresses the use of independent component analysis (ICA) for image compression. Our goal is to study the adequacy (for lossy transform compression) of bases learned from data using ICA. Since these bases are, in general, non-orthogonal, two methods are considered to obtain image representations: matching pursuit type algorithms and orthogonalization of the ICA bases followed by standard orthogonal projection. Several coder architectures are evaluated and compared, using both the usual SNR and a perceptual quality measure called picture quality scale. We consider four classes of images (natural, faces, fingerprints, and synthetic) to study the generalization and adaptation abilities of the data-dependent ICA bases. In this study, we have observed that bases learned from natural images generalize well to other classes of images, while bases learned from the other, specific classes show good specialization. For example, for fingerprint images, our coders perform close to the special-purpose WSQ coder developed by the FBI. For some classes, the visual quality of the images obtained with our coders is similar to that obtained with JPEG2000, which is currently the state-of-the-art coder and much more sophisticated than a simple transform coder. We conclude that ICA provides an excellent tool for learning a coder for a specific image class, which can even be done using a single image from that class. This is an alternative to hand-tailoring a coder for a given class (as was done, for example, in the WSQ for fingerprint images). Another conclusion is that a coder learned from natural images acts like a universal coder, that is, it generalizes very well for a wide range of image classes.
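The second representation mentioned above — orthogonalization of the ICA bases followed by orthogonal projection — can be sketched as below. QR factorization is one way to orthogonalize a learned basis; the paper's orthogonalization procedure and quantizer design may differ, and the random matrices here merely stand in for ICA-learned bases and image blocks.

```python
import numpy as np

def orthogonalize(B):
    """Replace a learned, generally non-orthogonal basis (columns of B)
    by an orthonormal basis spanning the same subspace, via QR."""
    Q, _ = np.linalg.qr(B)
    return Q

def code_blocks(blocks, Q, step=0.5):
    """Orthogonal projection of each block onto the basis, followed by
    uniform scalar quantization of the coefficients (the lossy step)."""
    return np.round((blocks @ Q) / step).astype(int)

def decode_blocks(codes, Q, step=0.5):
    """Dequantize the coefficients and map them back to the pixel domain."""
    return (codes * step) @ Q.T

rng = np.random.default_rng(2)
B = rng.normal(size=(8, 8))          # stand-in for an ICA-learned basis
Q = orthogonalize(B)
blocks = rng.normal(size=(20, 8))    # stand-in for 20 vectorized image blocks
recon = decode_blocks(code_blocks(blocks, Q), Q)
```

Because the projection basis is orthonormal, the reconstruction error equals the coefficient quantization error, so the distortion is controlled directly by the quantization step.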
Neurocomputing | 2014
Artur J. Ferreira; Mário A. T. Figueiredo
Discrete data representations are necessary, or at least convenient, in many machine learning problems. While feature selection (FS) techniques aim at finding relevant subsets of features, the goal of feature discretization (FD) is to find concise (quantized) data representations, adequate for the learning task at hand. In this paper, we propose two incremental methods for FD. The first method belongs to the filter family, in which the quality of the discretization is assessed by a (supervised or unsupervised) relevance criterion. The second method is a wrapper, where discretized features are assessed using a classifier. Both methods can be coupled with any static (unsupervised or supervised) discretization procedure and can be used to perform FS as pre-processing or post-processing stages. The proposed methods attain efficient representations suitable for binary and multi-class problems with different types of data, being competitive with existing methods. Moreover, using well-known FS methods with the features discretized by our techniques leads to better accuracy than with the features discretized by other methods or with the original features.
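A hedged sketch of the filter-style incremental idea: keep refining a feature's quantization while a (pluggable) relevance criterion still improves. The doubling schedule, equal-width grid, and entropy criterion here are illustrative choices, not the paper's exact procedure.

```python
import numpy as np

def entropy(codes):
    """Empirical entropy (bits) of a discrete feature -- one possible
    unsupervised relevance criterion."""
    _, counts = np.unique(codes, return_counts=True)
    p = counts / counts.sum()
    return -(p * np.log2(p)).sum()

def incremental_bins(x, relevance, max_bins=32, tol=1e-3):
    """Filter-style incremental discretization of one feature: keep
    doubling the number of equal-width bins while the relevance of the
    quantized feature still improves by more than tol."""
    best_codes, best_score, bins = None, -np.inf, 2
    while bins <= max_bins:
        edges = np.linspace(x.min(), x.max(), bins + 1)
        codes = np.clip(np.searchsorted(edges, x, side="right") - 1,
                        0, bins - 1)
        score = relevance(codes)
        if score <= best_score + tol:
            break                     # refinement stopped paying off
        best_codes, best_score, bins = codes, score, bins * 2
    return best_codes

rng = np.random.default_rng(3)
x = rng.uniform(size=1000)
best = incremental_bins(x, entropy)   # entropy keeps growing -> finest grid
```

The wrapper variant described in the abstract follows the same loop but replaces `relevance` with the accuracy of a classifier trained on the discretized feature.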
European Conference on Information Retrieval | 2012
Tony Tam; Artur J. Ferreira; André Lourenço
Automatic organization of email messages into folders is both an open problem and a challenge for machine learning techniques. Besides email overload, which affects many email users worldwide, there are increasing difficulties caused by the folder semantics adopted by each user. The varying number of folders and their meaning are personal and in many cases pose difficulties to learning methods. This paper addresses automatic organization of email messages into folders, based on supervised learning algorithms. The textual fields of the email message (subject and body) are considered for learning, with different representations, feature selection methods, and classifiers. The participant fields are embedded into a vector-space model representation. The classification decisions from the different email fields are combined by majority voting. Experiments on a subset of the Enron Corpus and on a private email dataset show a significant improvement over both the single classifiers on these fields and previous work.
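The combination step — one folder prediction per email field, merged by majority voting — is simple to sketch. The field names and tie-breaking rule below are our own assumptions for illustration.

```python
from collections import Counter

def majority_vote(field_predictions):
    """Combine the folder predicted from each e-mail field (e.g. subject,
    body, participants) by majority; ties go to the earliest field."""
    counts = Counter(field_predictions)
    top = max(counts.values())
    for label in field_predictions:   # first field wins ties
        if counts[label] == top:
            return label

folder = majority_vote(["projects/ml", "projects/ml", "archive"])
# folder == "projects/ml": two of the three field classifiers agree
```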
Iberian Conference on Pattern Recognition and Image Analysis | 2011
Artur J. Ferreira; Mário A. T. Figueiredo
In many applications, we deal with high-dimensional datasets with different types of data. For instance, in text classification and information retrieval problems, we have large collections of documents. Each text is usually represented by a bag-of-words or similar representation, with a large number of features (terms). Many of these features may be irrelevant (or even detrimental) for the learning tasks. This excessive number of features also raises memory problems in representing and handling these collections, clearly showing the need for adequate techniques for feature representation, reduction, and selection, to improve the classification accuracy and reduce the memory requirements. In this paper, we propose a combined unsupervised feature discretization and feature selection technique. The experimental results on standard datasets show the efficiency of the proposed technique as well as improvements over similar previous techniques.
IEEE XIII Workshop on Neural Networks for Signal Processing | 2003
Artur J. Ferreira; Mário A. T. Figueiredo
In this paper we address the orthogonalization of independent component analysis (ICA) bases to obtain transform-based image coders. We consider several classes of training images, from which we extract the independent components, followed by orthogonalization, obtaining bases for image coding. Experimental tests show the generalization ability of ICA bases learned from natural images, and their adaptation ability to specific classes. The proposed fixed-size block coders have lower transform complexity than JPEG. They outperform JPEG, on several classes of images, for a given range of compression ratios, according to both standard (SNR) and perceptual (picture quality scale - PQS) measures. For some image classes, the visual quality of the images obtained with our coders is similar to that obtained by JPEG2000, which is currently the state-of-the-art still-image coder. On fingerprint images, our fixed- and variable-size block coders perform competitively with the special-purpose wavelet-based coder developed by the FBI.
International Conference on Pattern Recognition Applications and Methods | 2015
Artur J. Ferreira; Mário A. T. Figueiredo
Feature discretization (FD) techniques often yield adequate and compact representations of the data, suitable for machine learning and pattern recognition problems. These representations usually decrease the training time and yield higher classification accuracy, while allowing humans to better understand and visualize the data, as compared to the use of the original features. This paper proposes two new FD techniques. The first one is based on the well-known Linde-Buzo-Gray quantization algorithm, coupled with a relevance criterion, being able to perform unsupervised, supervised, or semi-supervised discretization. The second technique works in supervised mode, being based on the maximization of the mutual information between each discrete feature and the class label. Our experimental results on standard benchmark datasets show that these techniques scale up to high-dimensional data, attaining in many cases better accuracy than existing unsupervised and supervised FD approaches, while using fewer discretization intervals.
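The core idea of the second technique — choosing discretization cuts that maximize the mutual information between the discrete feature and the class label — can be sketched for a single binary cut. This is a deliberate simplification: the method itself produces multi-interval discretizations.

```python
import numpy as np

def mutual_information(codes, y):
    """Empirical mutual information (bits) between two discrete arrays."""
    mi = 0.0
    for c in np.unique(codes):
        for k in np.unique(y):
            p_joint = np.mean((codes == c) & (y == k))
            if p_joint > 0:
                mi += p_joint * np.log2(
                    p_joint / (np.mean(codes == c) * np.mean(y == k)))
    return mi

def best_binary_cut(x, y):
    """Supervised 1-bit discretization: choose the threshold whose binary
    code shares the most information with the class label."""
    xs = np.sort(x)
    cuts = (xs[:-1] + xs[1:]) / 2                    # candidate midpoints
    scores = [mutual_information((x > t).astype(int), y) for t in cuts]
    return cuts[int(np.argmax(scores))]

x = np.array([0.0, 1.0, 2.0, 10.0, 11.0, 12.0])
y = np.array([0, 0, 0, 1, 1, 1])
cut = best_binary_cut(x, y)    # lands in the class-separating gap, at 6.0
```

At the chosen cut the binary code predicts the label perfectly, so the mutual information reaches its maximum of H(Y) = 1 bit for this toy example.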