Pavel Brazdil
University of Porto
Publications
Featured research published by Pavel Brazdil.
Machine Learning | 2003
Pavel Brazdil; Carlos Soares; Joaquim Pinto da Costa
We present a meta-learning method to support selection of candidate learning algorithms. It uses a k-Nearest Neighbor algorithm to identify the datasets that are most similar to the one at hand. The distance between datasets is assessed using a relatively small set of data characteristics, which was selected to represent properties that affect algorithm performance. The performance of the candidate algorithms on those datasets is used to generate a recommendation to the user in the form of a ranking. The performance is assessed using a multicriteria evaluation measure that takes not only accuracy, but also time into account. As it is not common in Machine Learning to work with rankings, we had to identify and adapt existing statistical techniques to devise an appropriate evaluation methodology. Using that methodology, we show that the meta-learning method presented leads to significantly better rankings than the baseline ranking method. The evaluation methodology is general and can be adapted to other ranking problems. Although here we have concentrated on ranking classification algorithms, the meta-learning framework presented can provide assistance in the selection of combinations of methods or more complex problem solving strategies.
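The ranking step can be illustrated with a minimal sketch: find the k datasets nearest in meta-feature space and average the candidate algorithms' scores on them. The meta-features, algorithm names and performance numbers below are invented for illustration and are not from the paper.

```python
# Minimal sketch of k-NN based algorithm ranking (illustrative, not the
# authors' implementation). All meta-features and scores are made up.
import numpy as np

# Rows: known datasets; columns: meta-features (e.g. #examples, #classes, ...).
meta_features = np.array([
    [1000, 2, 0.3],
    [5000, 10, 0.7],
    [200, 3, 0.5],
], dtype=float)

# performance[i, j]: score of algorithm j on known dataset i (higher is better).
algorithms = ["C4.5", "NaiveBayes", "kNN"]
performance = np.array([
    [0.85, 0.80, 0.78],
    [0.70, 0.75, 0.72],
    [0.90, 0.82, 0.88],
])

def recommend_ranking(new_meta, k=2):
    """Rank candidate algorithms for a new dataset via its k nearest
    neighbours in meta-feature space."""
    # Normalize features so no single characteristic dominates the distance.
    mu, sigma = meta_features.mean(axis=0), meta_features.std(axis=0)
    z = (meta_features - mu) / sigma
    q = (np.asarray(new_meta, dtype=float) - mu) / sigma
    dist = np.linalg.norm(z - q, axis=1)
    neighbours = np.argsort(dist)[:k]
    # Average the algorithms' scores on the neighbouring datasets and
    # return the algorithms sorted best-first.
    mean_perf = performance[neighbours].mean(axis=0)
    order = np.argsort(-mean_perf)
    return [(algorithms[i], mean_perf[i]) for i in order]

print(recommend_ranking([800, 2, 0.4]))
```

In the paper the scores combine accuracy and time via a multicriteria evaluation measure; here a single precomputed score per algorithm stands in for that measure.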
Archive | 2005
Alípio Mário Jorge; Luís Torgo; Pavel Brazdil; Rui Camacho; João Gama
Table of contents:
Invited Talks: Data Analysis in the Life Sciences - Sparking Ideas; Machine Learning for Natural Language Processing (and Vice Versa?); Statistical Relational Learning: An Inductive Logic Programming Perspective; Recent Advances in Mining Time Series Data; Focus the Mining Beacon: Lessons and Challenges from the World of E-Commerce; Data Streams and Data Synopses for Massive Data Sets.
Long Papers: k-Anonymous Patterns; Interestingness is Not a Dichotomy: Introducing Softness in Constrained Pattern Mining; Generating Dynamic Higher-Order Markov Models in Web Usage Mining; Tree 2 - Decision Trees for Tree Structured Data; Agglomerative Hierarchical Clustering with Constraints: Theoretical and Empirical Results; Cluster Aggregate Inequality and Multi-level Hierarchical Clustering; Ensembles of Balanced Nested Dichotomies for Multi-class Problems; Protein Sequence Pattern Mining with Constraints; An Adaptive Nearest Neighbor Classification Algorithm for Data Streams; Support Vector Random Fields for Spatial Classification; Realistic, Mathematically Tractable Graph Generation and Evolution, Using Kronecker Multiplication; A Correspondence Between Maximal Complete Bipartite Subgraphs and Closed Patterns; Improving Generalization by Data Categorization; Mining Model Trees from Spatial Data; Word Sense Disambiguation for Exploiting Hierarchical Thesauri in Text Classification; Mining Paraphrases from Self-anchored Web Sentence Fragments; M2SP: Mining Sequential Patterns Among Several Dimensions; A Systematic Comparison of Feature-Rich Probabilistic Classifiers for NER Tasks; Knowledge Discovery from User Preferences in Conversational Recommendation; Unsupervised Discretization Using Tree-Based Density Estimation; Weighted Average Pointwise Mutual Information for Feature Selection in Text Categorization; Non-stationary Environment Compensation Using Sequential EM Algorithm for Robust Speech Recognition; Hybrid Cost-Sensitive Decision Tree; Characterization of Novel HIV Drug Resistance Mutations Using Clustering, Multidimensional Scaling and SVM-Based Feature Ranking; Object Identification with Attribute-Mediated Dependences; Weka4WS: A WSRF-Enabled Weka Toolkit for Distributed Data Mining on Grids; Using Inductive Logic Programming for Predicting Protein-Protein Interactions from Multiple Genomic Data; ISOLLE: Locally Linear Embedding with Geodesic Distance; Active Sampling for Knowledge Discovery from Biomedical Data; A Multi-metric Index for Euclidean and Periodic Matching; Fast Burst Correlation of Financial Data; A Propositional Approach to Textual Case Indexing; A Quantitative Comparison of the Subgraph Miners MoFa, gSpan, FFSM, and Gaston; Efficient Classification from Multiple Heterogeneous Databases; A Probabilistic Clustering-Projection Model for Discrete Data.
Short Papers: Collaborative Filtering on Data Streams; The Relation of Closed Itemset Mining, Complete Pruning Strategies and Item Ordering in Apriori-Based FIM Algorithms; Community Mining from Multi-relational Networks; Evaluating the Correlation Between Objective Rule Interestingness Measures and Real Human Interest; A Kernel Based Method for Discovering Market Segments in Beef Meat; Corpus-Based Neural Network Method for Explaining Unknown Words by WordNet Senses; Segment and Combine Approach for Non-parametric Time-Series Classification; Producing Accurate Interpretable Clusters from High-Dimensional Data; Stress-Testing Hoeffding Trees; Rank Measures for Ordering; Dynamic Ensemble Re-Construction for Better Ranking; Frequency-Based Separation of Climate Signals; Efficient Processing of Ranked Queries with Sweeping Selection; Feature Extraction from Mass Spectra for Classification of Pathological States; Numbers in Multi-relational Data Mining; Testing Theories in Particle Physics Using Maximum Likelihood and Adaptive Bin Allocation; Improved Naive Bayes for Extremely Skewed Misclassification Costs; Clustering and Prediction of Mobile User Routes from Cellular Data; Elastic Partial Matching of Time Series; An Entropy-Based Approach for Generating Multi-dimensional Sequential Patterns; Visual Terrain Analysis of High-Dimensional Datasets; An Auto-stopped Hierarchical Clustering Algorithm for Analyzing 3D Model Database; A Comparison Between Block CEM and Two-Way CEM Algorithms to Cluster a Contingency Table; An Imbalanced Data Rule Learner; Improvements in the Data Partitioning Approach for Frequent Itemsets Mining; On-Line Adaptive Filtering of Web Pages; A Bi-clustering Framework for Categorical Data; Privacy-Preserving Collaborative Filtering on Vertically Partitioned Data; Indexed Bit Map (IBM) for Mining Frequent Sequences; STochFS: A Framework for Combining Feature Selection Outcomes Through a Stochastic Process; Speeding Up Logistic Model Tree Induction; A Random Method for Quantifying Changing Distributions in Data Streams; Deriving Class Association Rules Based on Levelwise Subspace Clustering; An Incremental Algorithm for Mining Generators Representation; Hybrid Technique for Artificial Neural Network Architecture and Weight Optimization.
Machine Learning | 2000
João Gama; Pavel Brazdil
Using multiple classifiers to increase learning accuracy is an active research area. In this paper we present two related methods for merging classifiers. The first method, Cascade Generalization, couples classifiers loosely. It belongs to the family of stacking algorithms. The basic idea of Cascade Generalization is to apply the set of classifiers sequentially, at each step extending the original data by inserting new attributes. The new attributes are derived from the class probability distribution given by a base classifier. This constructive step extends the representational language of the higher-level classifiers, relaxing their bias. The second method exploits tight coupling of classifiers by applying Cascade Generalization locally. At each iteration of a divide-and-conquer algorithm, the instance space is reconstructed by adding new attributes. Each new attribute represents the probability, given by a base classifier, that an example belongs to a class. We have implemented three Local Generalization algorithms. The first merges a linear discriminant with a decision tree, the second merges a naive Bayes with a decision tree, and the third merges both a linear discriminant and a naive Bayes with a decision tree. All the algorithms show an increase in performance when compared with the corresponding single models. Cascade also outperforms other methods for combining classifiers, such as Stacked Generalization, and competes well against Boosting at statistically significant confidence levels.
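A minimal sketch of the loose-coupling idea on a toy dataset, using scikit-learn (not the authors' implementation): a naive Bayes provides the class probabilities that extend the attribute space seen by a decision tree.

```python
# Sketch of the Cascade Generalization idea: extend the data with the
# class-probability outputs of a base classifier, then train a higher-level
# classifier on the extended representation.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Level-0: base classifier whose class distribution becomes new attributes.
base = GaussianNB().fit(X_tr, y_tr)
X_tr_ext = np.hstack([X_tr, base.predict_proba(X_tr)])
X_te_ext = np.hstack([X_te, base.predict_proba(X_te)])

# Level-1: the decision tree sees the original attributes plus the
# probabilities, which relaxes its representational bias.
tree = DecisionTreeClassifier(random_state=0).fit(X_tr_ext, y_tr)
print("cascade accuracy:", tree.score(X_te_ext, y_te))
```

In practice the probabilities appended to the training data are best produced out-of-fold (e.g. via cross-validation) so the higher-level classifier does not see overly optimistic level-0 outputs.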
Machine Learning | 2004
Christophe G. Giraud-Carrier; Ricardo Vilalta; Pavel Brazdil
Recent advances in meta-learning are providing the foundations to construct meta-learning assistants and task-adaptive learners. The goal of this special issue is to foster an interest in meta-learning by compiling representative work in the field. The contributions to this special issue provide strong insights into the construction of future meta-learning tools. In this introduction we present a common frame of reference to address work in meta-learning through the concept of meta-knowledge. We show how meta-learning can be simply defined as the process of exploiting knowledge about learning that enables us to understand and improve the performance of learning algorithms.
Machine Learning | 2004
Carlos Soares; Pavel Brazdil; Petr Kuba
The Support Vector Machine algorithm is sensitive to the choice of parameter settings; if these are not set correctly, the algorithm may perform poorly. Suggesting a good setting is thus an important problem. We propose a meta-learning methodology for this purpose that exploits information about the past performance of different settings. The methodology is applied to set the width of the Gaussian kernel. We carry out an extensive empirical evaluation, including comparisons with other methods (a fixed default ranking, selection based on cross-validation, and a heuristic method commonly used to set the width of the SVM kernel). We show that our methodology can select settings with low error while providing significant savings in time. Further work should be carried out to see how the methodology could be adapted to other parameter-setting tasks.
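For contrast, here is a minimal sketch of the cross-validation baseline the methodology is compared against; the meta-learning approach avoids running this search on the new dataset by reusing past performance. The dataset and candidate grid below are arbitrary choices.

```python
# Sketch of the cross-validation baseline for setting the Gaussian kernel
# width. Dataset and candidate grid are illustrative only.
import numpy as np
from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_wine(return_X_y=True)

# Candidate widths for the RBF kernel; note gamma = 1 / (2 * sigma^2).
candidates = np.logspace(-6, 1, 8)
scores = [cross_val_score(SVC(kernel="rbf", gamma=g), X, y, cv=5).mean()
          for g in candidates]
best = candidates[int(np.argmax(scores))]
print("best gamma by cross-validation:", best)
```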
european conference on machine learning | 1994
Pavel Brazdil; João Gama; Bob Henery
This paper is concerned with a comparative study of different machine learning, statistical and neural algorithms, and with an automatic analysis of the test results. It is shown that machine learning methods themselves can be used to organize this knowledge. Various datasets can be characterized using different statistical and information-theoretic measures. These, together with the test results, can be used by an ML system to generate a set of rules, which can also be altered or edited by the user. The system can be applied to a new dataset to provide the user with a set of recommendations concerning the suitability of different algorithms, graded by an appropriate information score. The experiments with the implemented system indicate that the method is viable and useful.
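A minimal sketch of the idea, assuming a toy meta-dataset (all numbers and labels invented): a decision tree learner induces readable rules that map dataset characteristics to the best-performing algorithm.

```python
# Sketch of learning rules that recommend algorithms from dataset
# characteristics. The meta-dataset below is entirely made up.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Each row describes one dataset: [n_examples, n_attributes, class_entropy].
meta_X = np.array([
    [100, 5, 0.9], [10000, 20, 0.4], [500, 8, 0.8],
    [20000, 50, 0.3], [300, 4, 0.95], [15000, 30, 0.5],
])
# Label: which algorithm performed best on that dataset.
meta_y = ["NaiveBayes", "C4.5", "NaiveBayes", "C4.5", "NaiveBayes", "C4.5"]

tree = DecisionTreeClassifier(max_depth=2).fit(meta_X, meta_y)
# The induced rules can be inspected (and in principle edited) by the user.
print(export_text(tree, feature_names=["n_examples", "n_attributes", "entropy"]))
```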
european conference on machine learning | 2000
Pavel Brazdil; Carlos Soares
We investigate the problem of using past performance information to select an algorithm for a given classification problem. We present three ranking methods for that purpose: average ranks, success rate ratios and significant wins. We also analyze the problem of evaluating and comparing these methods. The evaluation technique used is based on a leave-one-out procedure. On each iteration, the method generates a ranking using the results obtained by the algorithms on the training datasets. This ranking is then evaluated by calculating its distance from the ideal ranking built using the performance information on the test dataset. The distance measure adopted here, average correlation, is based on Spearman's rank correlation coefficient. To compare ranking methods, a combination of Friedman's test and Dunn's multiple comparison procedure is adopted. When applied to the methods presented here, these tests indicate that the success rate ratios and average ranks methods perform better than significant wins.
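A minimal sketch of the average-ranks method and its evaluation against an ideal ranking via Spearman's coefficient, with invented accuracy figures:

```python
# Average-ranks recommendation and Spearman-based evaluation (illustrative).
import numpy as np
from scipy.stats import rankdata, spearmanr

# accuracies[i, j]: accuracy of algorithm j on training dataset i.
accuracies = np.array([
    [0.81, 0.74, 0.78],
    [0.65, 0.70, 0.66],
    [0.92, 0.88, 0.90],
])

# Rank algorithms on each dataset (rank 1 = best), then average the ranks;
# a lower average rank means the algorithm is recommended more highly.
ranks = np.vstack([rankdata(-row) for row in accuracies])
recommended = ranks.mean(axis=0)

# Evaluate against the "ideal" ranking observed on a held-out test dataset.
ideal = rankdata(-np.array([0.85, 0.79, 0.83]))
rho, _ = spearmanr(recommended, ideal)
print("recommended ranks:", recommended, "Spearman rho:", rho)
```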
portuguese conference on artificial intelligence | 1995
João Gama; Pavel Brazdil
This paper is concerned with the problem of characterization of classification algorithms. The aim is to determine under what circumstances a particular classification algorithm is applicable. The method used involves generation of different kinds of models. These include regression and rule models, piecewise linear models (model trees) and instance based models. These are generated automatically on the basis of dataset characteristics and given test results. The lack of data is compensated for by various types of preprocessing. The models obtained are characterized by quantifying their predictive capability and the best models are identified.
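One of the model types considered, a regression model over dataset characteristics, can be sketched as follows (all numbers invented; the paper also considers rule models, model trees and instance-based models):

```python
# Sketch of regressing an algorithm's error rate on dataset characteristics.
import numpy as np
from sklearn.linear_model import LinearRegression

# Dataset characteristics: [log10(n_examples), n_classes, mean_correlation].
chars = np.array([
    [2.0, 2, 0.1], [3.7, 10, 0.4], [2.7, 3, 0.2],
    [4.3, 50, 0.6], [2.5, 4, 0.15],
])
# Observed error rate of one candidate algorithm on each dataset.
errors = np.array([0.12, 0.31, 0.18, 0.45, 0.14])

model = LinearRegression().fit(chars, errors)
# Predicting the error on a new dataset (and comparing such predictions
# across algorithms) characterizes where the algorithm is applicable.
print("predicted error:", model.predict([[3.0, 5, 0.25]])[0])
```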
discovery science | 2002
Yonghong Peng; Peter A. Flach; Carlos Soares; Pavel Brazdil
This paper presents new measures, based on an induced decision tree, to characterise datasets for meta-learning in order to select appropriate learning algorithms. The main idea is to capture the characteristics of a dataset through the structural shape and size of a decision tree induced from it. In total, 15 measures are proposed to describe the structure of a decision tree. Their effectiveness is illustrated through extensive experiments, by comparison with the results obtained by existing data characterization techniques, including the data characteristics tool (DCT), the most widely used technique in meta-learning, and landmarking, the most recently developed method.
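Tree-structure measures of this kind can be extracted from a fitted scikit-learn tree; this sketch shows the flavour of such meta-features, not the paper's exact set of 15.

```python
# Extracting a few decision-tree based meta-features (illustrative subset).
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(random_state=0).fit(X, y)
t = tree.tree_

n_nodes = t.node_count
# A node is a leaf when it has no children (both child pointers are -1).
n_leaves = sum(1 for i in range(n_nodes)
               if t.children_left[i] == t.children_right[i])
measures = {
    "n_nodes": n_nodes,
    "n_leaves": n_leaves,
    "max_depth": int(tree.get_depth()),
    "nodes_per_attribute": n_nodes / X.shape[1],
    "nodes_per_example": n_nodes / X.shape[0],
}
print(measures)
```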
international conference on artificial intelligence in theory and practice | 2006
Fabrice Colas; Pavel Brazdil
Document classification has been widely studied: some studies compared feature selection techniques or feature space transformations, whereas others compared the performance of different algorithms. Recently, following the rising interest in the Support Vector Machine, various studies have shown that SVM outperforms other classification algorithms. So should we simply stop bothering with other classification algorithms and always opt for SVM?
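The kind of head-to-head comparison the paper questions can be sketched as follows, pitting a linear SVM against naive Bayes on a small text task (the corpus and the two categories are arbitrary choices; the paper's experimental setup differs).

```python
# Comparing a linear SVM with naive Bayes on a toy text classification task.
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

cats = ["sci.space", "rec.autos"]
train = fetch_20newsgroups(subset="train", categories=cats)
test = fetch_20newsgroups(subset="test", categories=cats)

for clf in (LinearSVC(), MultinomialNB()):
    # Same TF-IDF representation for both learners, so only the classifier
    # varies in the comparison.
    model = make_pipeline(TfidfVectorizer(), clf)
    model.fit(train.data, train.target)
    print(type(clf).__name__, "accuracy:", model.score(test.data, test.target))
```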