Seppo Puuronen | Researchain

Publication

Featured research published by Seppo Puuronen.

Information Fusion | 2003

Ensemble feature selection with the simple Bayesian classification

Alexey Tsymbal; Seppo Puuronen; David W. Patterson

A popular method for creating an accurate classifier from a set of training data is to build several classifiers and then combine their predictions. Ensembles of simple Bayesian classifiers have traditionally not been a focus of research. One way to generate an ensemble of accurate and diverse simple Bayesian classifiers is to use different feature subsets generated with the random subspace method. In this case, the ensemble consists of multiple classifiers constructed by randomly selecting feature subsets, that is, classifiers constructed in randomly chosen subspaces. In this paper, we present an algorithm for building ensembles of simple Bayesian classifiers in random subspaces. The EFS_SBC algorithm includes a hill-climbing-based refinement cycle, which tries to improve the accuracy and diversity of the base classifiers built on random feature subsets. We conduct a number of experiments on a collection of 21 real-world and synthetic data sets, comparing the EFS_SBC ensembles with the single simple Bayes and with the boosted simple Bayes. In many cases the EFS_SBC ensembles have higher accuracy than both the single simple Bayesian classifier and the boosted Bayesian ensemble. We find that the ensembles produced focusing on diversity have lower generalization error, and that the importance of diversity in building the ensembles differs between data sets. We propose several methods for the integration of simple Bayesian classifiers in the ensembles. In a number of cases the techniques for dynamic integration of classifiers have significantly better classification accuracy than their simple static analogues. We suggest that a reason for this is that dynamic integration makes better use of the ensemble coverage than static integration.
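The random subspace idea described in this abstract can be illustrated with a minimal sketch. It assumes categorical features, Laplace smoothing, and a plain majority vote; the names `SimpleBayes`, `random_subspace_ensemble`, and `majority_vote` are hypothetical, and the hill-climbing refinement cycle and the dynamic integration methods of EFS_SBC are not reproduced here.

```python
import math
import random
from collections import Counter, defaultdict


class SimpleBayes:
    """Categorical naive Bayes with Laplace smoothing on a fixed feature subset."""

    def __init__(self, features):
        self.features = list(features)

    def fit(self, X, y):
        self.class_counts = Counter(y)
        self.n = len(y)
        # counts[(feature, label)][value] = number of training rows with that value
        self.counts = defaultdict(Counter)
        self.values = defaultdict(set)
        for row, label in zip(X, y):
            for f in self.features:
                self.counts[(f, label)][row[f]] += 1
                self.values[f].add(row[f])
        return self

    def predict(self, row):
        best_label, best_score = None, -math.inf
        for label, c_count in self.class_counts.items():
            # log prior plus log likelihood of each selected feature value
            score = math.log(c_count / self.n)
            for f in self.features:
                num = self.counts[(f, label)][row[f]] + 1
                den = c_count + len(self.values[f])
                score += math.log(num / den)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def random_subspace_ensemble(X, y, n_members=5, subset_size=2, seed=0):
    """Train one SimpleBayes per randomly drawn feature subset (assumed sizes)."""
    rng = random.Random(seed)
    n_features = len(X[0])
    members = []
    for _ in range(n_members):
        subset = rng.sample(range(n_features), subset_size)
        members.append(SimpleBayes(subset).fit(X, y))
    return members


def majority_vote(members, row):
    """Static integration: every ensemble member casts one equal vote."""
    votes = Counter(m.predict(row) for m in members)
    return votes.most_common(1)[0][0]
```

Each member sees all training rows but only its own feature subset, which is what makes the members differ from one another.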

Information Fusion | 2008

Dynamic integration of classifiers for handling concept drift

Alexey Tsymbal; Mykola Pechenizkiy; Pádraig Cunningham; Seppo Puuronen

In the real world, concepts are often not stable but change with time. A typical example of this in the biomedical context is antibiotic resistance, where pathogen sensitivity may change over time as new pathogen strains develop resistance to antibiotics that were previously effective. This problem, known as concept drift, complicates the task of learning a model from data and requires special approaches, different from commonly used techniques that treat arriving instances as equally important contributors to the final concept. The underlying data distribution may change as well, making previously built models useless. This is known as virtual concept drift. Both types of concept drift make regular updates of the model necessary. Among the most popular and effective approaches to handling concept drift is ensemble learning, where a set of models built over different time periods is maintained and the best model is selected or the predictions of models are combined, usually according to their expertise level regarding the current concept. In this paper we propose the use of an ensemble integration technique that would help to better handle concept drift at an instance level. In dynamic integration of classifiers, each base classifier is given a weight proportional to its local accuracy with regard to the instance tested, and the best base classifier is selected, or the classifiers are integrated using weighted voting. Our experiments with synthetic data sets simulating abrupt and gradual concept drifts and with a real-world antibiotic resistance data set demonstrate that dynamic integration of classifiers built over small time intervals or fixed-sized data blocks can be significantly better than majority voting and weighted voting, which are currently the most commonly used integration techniques for handling concept drift with ensembles.
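The two integration modes named in this abstract, selecting the locally best classifier or voting with local-accuracy weights, might be sketched as follows. This is an illustration under assumptions, not the authors' implementation: local accuracy is estimated on the k nearest labelled points by Euclidean distance, classifiers are plain callables, and all function names are hypothetical.

```python
import math
from collections import defaultdict


def local_accuracies(classifiers, X_val, y_val, x, k=3):
    """Accuracy of each classifier on the k labelled points nearest to x."""
    nearest = sorted(range(len(X_val)), key=lambda i: math.dist(X_val[i], x))[:k]
    return [
        sum(clf(X_val[i]) == y_val[i] for i in nearest) / len(nearest)
        for clf in classifiers
    ]


def dynamic_selection(classifiers, X_val, y_val, x, k=3):
    """Predict with the single classifier that is most accurate around x."""
    acc = local_accuracies(classifiers, X_val, y_val, x, k)
    best = max(range(len(classifiers)), key=lambda j: acc[j])
    return classifiers[best](x)


def dynamic_voting(classifiers, X_val, y_val, x, k=3):
    """Weighted vote in which each weight is the classifier's local accuracy."""
    acc = local_accuracies(classifiers, X_val, y_val, x, k)
    votes = defaultdict(float)
    for clf, weight in zip(classifiers, acc):
        votes[clf(x)] += weight
    return max(votes, key=votes.get)
```

Because the weights are recomputed for every test instance, a model that is poor overall can still dominate in the region where it remains accurate, which is the instance-level behaviour the abstract describes.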

international symposium on methodologies for intelligent systems | 1999

A Dynamic Integration Algorithm for an Ensemble of Classifiers

Seppo Puuronen; Vagan Y. Terziyan; Alexey Tsymbal

One of the most important directions for improving data mining and knowledge discovery methods is the integration of multiple classification techniques in ensembles of classifiers. An integration technique must solve the problem of estimating and selecting the most appropriate component classifiers for an ensemble. We discuss an advanced dynamic integration of multiple classifiers as one possible variation of the stacked generalization method, assuming that each component classifier performs best inside certain areas of the application domain. In the learning phase a performance matrix of each component classifier is derived; in the application phase it is used to predict the performance of each component classifier on new instances.

european conference on principles of data mining and knowledge discovery | 2000

Bagging and Boosting with Dynamic Integration of Classifiers

Alexey Tsymbal; Seppo Puuronen

One approach in classification tasks is to use machine learning techniques to derive classifiers from learning instances. The co-operation of several base classifiers as a decision committee has succeeded in reducing classification error. The main current decision committee learning approaches, boosting and bagging, resample the training set and can be used with different machine learning techniques that derive base classifiers. Boosting uses a kind of weighted voting and bagging uses equal-weight voting as the combining method. Neither takes into account the local aspects that the base classifiers may have inside the problem space. We have proposed a dynamic integration technique to be used with ensembles of classifiers. In this paper, the proposed dynamic integration technique is applied with AdaBoost and bagging. Comparison results on several datasets from the UCI machine learning repository show that boosting and bagging with dynamic integration of classifiers often achieve better accuracy than boosting and bagging with their original voting techniques.
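The contrast between the two original voting schemes can be made concrete with a short sketch. The abstract does not give the weighting formula; the code below assumes the standard AdaBoost.M1 vote weight log((1 - error) / error), and the function names are hypothetical.

```python
import math
from collections import defaultdict


def equal_weight_vote(predictions):
    """Bagging-style combination: every base classifier gets one vote."""
    votes = defaultdict(float)
    for label in predictions:
        votes[label] += 1.0
    return max(votes, key=votes.get)


def adaboost_weight(error):
    """Assumed AdaBoost.M1 vote weight, valid for 0 < error < 0.5."""
    return math.log((1.0 - error) / error)


def weighted_vote(predictions, errors):
    """Boosting-style combination: votes weighted by global training error."""
    votes = defaultdict(float)
    for label, err in zip(predictions, errors):
        votes[label] += adaboost_weight(err)
    return max(votes, key=votes.get)
```

In both schemes a classifier carries the same weight for every test instance, which is the lack of locality that the abstract points out.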

International journal of continuing engineering education and life-long learning | 2007

Feedback adaptation in web-based learning systems

Ekaterina Vasilyeva; Seppo Puuronen; Mykola Pechenizkiy; Pekka Räsänen

Feedback provided by a learning system to its users plays an important role in web-based education. This paper presents an overview of feedback studies and then concentrates on the problem of feedback adaptation in web-based learning systems. We introduce our taxonomy of the feedback concept with regard to its functions, complexity, intention, time of occurrence, way of presentation, and the level and way of its adaptation. We consider what can be adapted in feedback and how to facilitate feedback adaptation in web-based learning systems.

computer-based medical systems | 2006

Class Noise and Supervised Learning in Medical Domains: The Effect of Feature Extraction

Mykola Pechenizkiy; Alexey Tsymbal; Seppo Puuronen; Oleksandr Pechenizkiy

Inductive learning systems have been successfully applied in a number of medical domains. It is generally accepted that the highest accuracy that an inductive learning system can achieve depends on the quality of the data and on the appropriate selection of a learning algorithm for the data. In this paper we analyze the effect of class noise on supervised learning in medical domains. We review the related work on learning from noisy data and propose to use feature extraction as a pre-processing step to diminish the effect of class noise on the learning process. Our experiments with 8 medical datasets show that feature extraction indeed helps to deal with class noise. It clearly results in higher classification accuracy of learnt models without the separate explicit elimination of noisy instances.

computer-based medical systems | 2006

Handling Local Concept Drift with Dynamic Integration of Classifiers: Domain of Antibiotic Resistance in Nosocomial Infections

Alexey Tsymbal; Mykola Pechenizkiy; Pádraig Cunningham; Seppo Puuronen

In the real world, concepts and data distributions are often not stable but change with time. This problem, known as concept drift, complicates the task of learning a model from data and requires special approaches, different from commonly used techniques, which treat arriving instances as equally important contributors to the target concept. Among the most popular and effective approaches to handling concept drift is ensemble learning, where a set of models built over different time periods is maintained and the best model is selected or the predictions of models are combined. In this paper we consider the use of an ensemble integration technique that helps to better handle concept drift at the instance level. Our experiments with real-world antibiotic resistance data demonstrate that dynamic integration of classifiers built over small time intervals can be more effective than globally weighted voting, which is currently the most commonly used integration approach for handling concept drift with ensembles.

computer-based medical systems | 2005

Towards the framework of adaptive user interfaces for eHealth

Ekaterina Vasilyeva; Mykola Pechenizkiy; Seppo Puuronen

Diversity within a group of users, each with individual abilities, interests, and needs, challenges the developers of eHealth projects with heterogeneous needs in information delivery and/or other eHealth services. This paper considers an adaptive user interface approach as an opportunity to address this challenge. We briefly overview the recent achievements in the area of user interface adaptation and discuss the application of these achievements in the eHealth context. We introduce the basic elements of our framework for adaptive user interfaces (AUI) for eHealth systems. Then, we use this framework in our review of work in the area of AUI for eHealth applications. We conclude with a brief discussion of the current focus of AUI research in eHealth and of interesting directions for further research.

International Journal of Medical Informatics | 1998

The decision support system for telemedicine based on multiple expertise

Vagan Y. Terziyan; Alexey Tsymbal; Seppo Puuronen

This paper discusses the application of artificial intelligence in telemedicine and some of our research results in this area. The main goal of our research is to develop methods and systems to collect, analyse, distribute and use medical diagnostics knowledge from multiple knowledge sources and areas of expertise. The use of modern communication tools enables a physician to collect and analyse information obtained from experts worldwide with the help of a decision support medical system. In this paper we discuss a multilevel representation and processing of medical data using a system which evaluates and exploits knowledge about the behaviour of statistical diagnostics methods. The presented technique is able to acquire semantically essential information from the complex dynamics of quasi-periodical medical signals by recursively applying ordinary statistical tools. A method and an algorithm are elaborated to select automatically the most appropriate diagnostics method for each case under consideration. We suggest the use of a voting-type technique to search for consensus among the different opinions of medical experts. Research results can be applied in the development of a telediagnostics expert medical system and a medical teleconsulting support system.

computer-based medical systems | 2003

Search strategies for ensemble feature selection in medical diagnostics

Alexey Tsymbal; Pádraig Cunningham; Mykola Pechenizkiy; Seppo Puuronen

The goal of this paper is to propose, evaluate, and compare four search strategies for ensemble feature selection, and to consider their application to medical diagnostics, with a focus on the problem of the classification of acute abdominal pain. Ensembles of learnt models constitute one of the main current directions in machine learning and data mining. Ensembles allow us to get higher accuracy, sensitivity, and specificity, which are often not achievable with single models. One technique that has proved effective for ensemble construction is feature selection. Lately, several strategies for ensemble feature selection have been proposed, including random subspacing, hill-climbing-based search, and genetic search. In this paper, we propose two new sequential-search-based strategies for ensemble feature selection, and evaluate them, constructing ensembles of simple Bayesian classifiers for the problem of acute abdominal pain classification. We compare the search strategies with regard to achieved accuracy, sensitivity, specificity, and the average number of features they select.
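As a rough illustration of what a sequential search over feature subsets looks like, here is a generic greedy forward selection for a single ensemble member. The abstract does not describe how the proposed strategies score candidate subsets, so the caller-supplied `fitness` function and the name `forward_sequential_selection` are assumptions made for this sketch only.

```python
def forward_sequential_selection(n_features, fitness):
    """Greedily add the feature that most improves fitness; stop when none helps.

    `fitness` is a hypothetical callable that maps a list of feature indices
    to a score, where higher is better.
    """
    selected = []
    best = fitness(selected)
    remaining = set(range(n_features))
    while remaining:
        candidate, candidate_score = None, best
        for f in sorted(remaining):
            score = fitness(selected + [f])
            if score > candidate_score:
                candidate, candidate_score = f, score
        if candidate is None:
            break  # no single added feature improves the score
        selected.append(candidate)
        remaining.remove(candidate)
        best = candidate_score
    return selected
```

A backward variant would start from the full feature set and remove one feature at a time under the same stopping rule.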
