Publication


Featured research published by Marina Skurichina.


Pattern Analysis and Applications | 2002

Bagging, Boosting and the Random Subspace Method for Linear Classifiers

Marina Skurichina; Robert P. W. Duin

Abstract: Recently, bagging, boosting, and the random subspace method have become popular combining techniques for improving weak classifiers. These techniques are designed for, and usually applied to, decision trees. In this paper, in contrast to a common opinion, we demonstrate that they may also be useful in linear discriminant analysis. Simulation studies, carried out for several artificial and real data sets, show that the performance of the combining techniques is strongly affected by the small-sample-size properties of the base classifier: boosting is useful for large training sample sizes, while bagging and the random subspace method are useful for critical training sample sizes. Finally, a table describing the possible usefulness of the combining techniques for linear classifiers is presented.
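
The setup described above is easy to reproduce in outline. The following sketch (an illustration assuming scikit-learn and synthetic data, not the authors' code) applies bagging, the random subspace method, and boosting to a linear base classifier; logistic regression stands in for LDA in the boosting case, since scikit-learn's LDA does not accept the sample weights that boosting requires.

```python
# A minimal sketch (not the authors' code) of the three combining techniques
# applied to linear base classifiers with scikit-learn; data and parameter
# values are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=200, n_features=30, random_state=0)

ensembles = {
    # Bagging: each classifier is trained on a bootstrap replicate.
    "bagging": BaggingClassifier(LinearDiscriminantAnalysis(),
                                 n_estimators=50, random_state=0),
    # Random subspace method: each classifier is trained on a random
    # half of the features instead of a bootstrap sample.
    "random subspace": BaggingClassifier(LinearDiscriminantAnalysis(),
                                         n_estimators=50, bootstrap=False,
                                         max_features=0.5, random_state=0),
    # Boosting: scikit-learn's LDA cannot use sample weights, so logistic
    # regression stands in as the weighted linear base learner.
    "boosting": AdaBoostClassifier(LogisticRegression(max_iter=1000),
                                   n_estimators=50, random_state=0),
}
for name, clf in ensembles.items():
    print(name, cross_val_score(clf, X, y, cv=5).mean())
```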


Information Fusion | 2002

An experimental study on diversity for bagging and boosting with linear classifiers

Ludmila I. Kuncheva; Marina Skurichina; Robert P. W. Duin

Abstract: In classifier combination, it is believed that diverse ensembles have a better potential for improvement in accuracy than non-diverse ensembles. We put this hypothesis to the test for two methods for building the ensembles, Bagging and Boosting, with two linear classifier models: the nearest mean classifier and the pseudo-Fisher linear discriminant classifier. To estimate diversity, we apply nine measures proposed in the recent literature on combining classifiers. Eight combination methods were used: minimum, maximum, product, average, simple majority, weighted majority, Naive Bayes and decision templates. We carried out experiments on seven data sets for different sample sizes, different numbers of classifiers in the ensembles, and the two linear classifiers. Altogether, we created 1364 ensembles by the Bagging method and the same number by the Boosting method. On each of these, we calculated the nine measures of diversity and the accuracy of the eight different combination methods, averaged over 50 runs. The results confirmed in a quantitative way the intuitive explanation behind the success of Boosting for linear classifiers for increasing training sizes, and the poor performance of Bagging in this case. Diversity measures indicated that Boosting succeeds in inducing diversity even for stable classifiers, whereas Bagging does not.
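
To make the notion of diversity concrete, the sketch below (assumed, not taken from the paper) implements one of the simplest of the pairwise measures used in such studies, the disagreement measure, from the "oracle" outputs of two classifiers (1 = correct, 0 = wrong).

```python
# A minimal sketch (assumed, not from the paper) of the pairwise
# disagreement measure, computed from "oracle" outputs of two classifiers.
import numpy as np

def disagreement(oracle_a, oracle_b):
    """Fraction of objects on which exactly one of the two classifiers
    is correct. Ranges from 0 (identical behaviour) to 1."""
    a = np.asarray(oracle_a, dtype=bool)
    b = np.asarray(oracle_b, dtype=bool)
    return np.mean(a ^ b)

# Illustrative oracle outputs for two classifiers on ten test objects.
a = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
b = [1, 0, 0, 1, 1, 1, 0, 0, 1, 1]
print(disagreement(a, b))  # 0.3
```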


Pattern Recognition | 1998

Bagging for Linear Classifiers

Marina Skurichina; Robert P. W. Duin

Abstract: Classifiers built on small training sets are usually biased or unstable. Different techniques exist to construct more stable classifiers, but it is not clear which ones are good, and whether they really stabilize the classifier or merely improve the performance. In this paper bagging (bootstrapping and aggregating) [L. Breiman, Bagging predictors, Machine Learning J. 24(2), 123–140 (1996)] is studied for a number of linear classifiers. A measure for the instability of classifiers is introduced. The influence of regularization and bagging on this instability and on the generalization error of linear classifiers is investigated. In a simulation study it is shown that in general bagging is not a stabilizing technique. It is also demonstrated that the instability of the classifier can be used to predict how useful bagging will be. Finally, it is shown experimentally that bagging may improve the performance of the classifier only in very unstable situations.
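
The paper defines its own instability measure; the variant below is only a plausible illustration of the idea: retrain the classifier on bootstrap replicates of the training set and record how often its decisions on a fixed test set change.

```python
# An illustrative instability estimate (the paper defines its own measure;
# this variant is an assumption): retrain a classifier on bootstrap
# replicates and record how often its decisions change on a fixed test set.
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def instability(clf, X_train, y_train, X_test, n_boot=25, seed=0):
    rng = np.random.default_rng(seed)
    base_pred = clone(clf).fit(X_train, y_train).predict(X_test)
    changes = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(X_train), len(X_train))  # bootstrap sample
        pred = clone(clf).fit(X_train[idx], y_train[idx]).predict(X_test)
        changes.append(np.mean(pred != base_pred))
    return float(np.mean(changes))

X, y = make_classification(n_samples=60, n_features=20, random_state=1)
print(instability(LinearDiscriminantAnalysis(), X[:40], y[:40], X[40:]))
```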


Journal of Biomedical Optics | 2004

Clinical study for classification of benign, dysplastic, and malignant oral lesions using autofluorescence spectroscopy

Diana C.G. de Veld; Marina Skurichina; Max J. H. Witjes; Robert P. W. Duin; Henricus J. C. M. Sterenborg; Jan Roodenburg

Autofluorescence spectroscopy shows promising results for detection and staging of oral (pre-)malignancies. To improve staging reliability, we develop and compare algorithms for lesion classification. Furthermore, we examine the potential for detecting invisible tissue alterations. Autofluorescence spectra are recorded at six excitation wavelengths from 172 benign, dysplastic, and cancerous lesions and from 97 healthy volunteers. We apply principal components analysis (PCA), artificial neural networks, and red/green intensity ratios to separate benign from (pre-)malignant lesions, using four normalization techniques. To assess the potential for detecting invisible tissue alterations, we compare PC scores of healthy mucosa and surroundings/contralateral positions of lesions. The spectra show large variations in shape and intensity within each lesion group. Intensities and PC score distributions demonstrate large overlap between benign and (pre-)malignant lesions. The receiver-operator characteristic areas under the curve (ROC-AUCs) for distinguishing cancerous from healthy tissue are excellent (0.90 to 0.97). However, the ROC-AUCs are too low for classification of benign versus (pre-)malignant mucosa for all methods (0.50 to 0.70). Some statistically significant differences between surrounding/contralateral tissues of benign and healthy tissue and of (pre-)malignant lesions are observed. We can successfully separate healthy mucosa from cancers (ROC-AUC>0.9). However, autofluorescence spectroscopy is not able to distinguish benign from visible (pre-)malignant lesions using our methods (ROC-AUC<0.65). The observed significant differences between healthy tissue and surroundings/contralateral positions of lesions might be useful for invisible tissue alteration detection.
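
The classification pipeline described here (normalization, PCA, a classifier, ROC-AUC evaluation) can be sketched as follows; synthetic data stands in for the autofluorescence spectra, and all parameter values are illustrative assumptions.

```python
# A minimal sketch (assumptions: synthetic data stands in for the real
# spectra) of a PCA-based classification pipeline with ROC-AUC evaluation.
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import Normalizer

# Stand-in for spectra: rows = lesions, columns = wavelength bins.
X, y = make_classification(n_samples=150, n_features=100, n_informative=10,
                           random_state=0)

pipe = make_pipeline(
    Normalizer(),           # one of several possible spectrum normalizations
    PCA(n_components=10),   # project spectra onto principal components
    LinearDiscriminantAnalysis(),
)
scores = cross_val_predict(pipe, X, y, cv=5, method="predict_proba")[:, 1]
print("ROC-AUC:", roc_auc_score(y, scores))
```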


IEEE Transactions on Neural Networks | 2000

k-nearest neighbors directed noise injection in multilayer perceptron training

Marina Skurichina; Sarunas Raudys; Robert P. W. Duin

The relation between classifier complexity and learning set size is very important in discriminant analysis. One way to overcome the complexity control problem is to add noise to the training objects, thereby increasing the size of the training set. Both the amount and the directions of noise injection are important factors that determine the effectiveness of classifier training. In this paper the effect of injecting Gaussian spherical noise and k-nearest neighbors directed noise on the performance of multilayer perceptrons is studied. As an analytical investigation for multilayer perceptrons is not feasible, a theoretical analysis is made for statistical classifiers. The goal is to get a better understanding of the effect of noise injection on the accuracy of sample-based classifiers. Both empirical and theoretical studies show that k-nearest neighbors directed noise injection is preferable over Gaussian spherical noise injection for data with low intrinsic dimensionality.
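
A minimal sketch of k-nearest neighbors directed noise injection, as one plausible reading of the technique (not the authors' code): each training object is perturbed along the direction toward a randomly chosen one of its k nearest neighbors, rather than in a random spherical direction.

```python
# A minimal sketch (an assumed reading of the technique, not the authors'
# code): augment the training set with objects perturbed along directions
# toward their k nearest neighbours rather than with spherical Gaussian noise.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def knn_directed_noise(X, k=3, n_new=2, scale=0.5, seed=0):
    """For each object, create n_new noisy copies displaced toward a
    randomly chosen one of its k nearest neighbours."""
    rng = np.random.default_rng(seed)
    # k + 1 because each point is its own nearest neighbour.
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nn.kneighbors(X)
    new = []
    for i in range(len(X)):
        for _ in range(n_new):
            j = rng.choice(idx[i][1:])      # skip the point itself
            step = rng.uniform(0, scale)    # random fraction of the way
            new.append(X[i] + step * (X[j] - X[i]))
    return np.vstack(new)

X = np.random.default_rng(1).normal(size=(20, 5))
print(knn_directed_noise(X).shape)  # (40, 5)
```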


Multiple Classifier Systems | 2001

Bagging and the Random Subspace Method for Redundant Feature Spaces

Marina Skurichina; Robert P. W. Duin

The performance of a single weak classifier can be improved by using combining techniques such as bagging, boosting, and the random subspace method. When applied to linear discriminant analysis, these techniques turn out to be useful in different situations. Their performance is strongly affected by the choice of the base classifier and the training sample size, and their usefulness also depends on the data distribution. In this paper, using the pseudo-Fisher linear classifier as an example, we study the effect of redundancy in the data feature set on the performance of the random subspace method and bagging.
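
To illustrate the setting, the sketch below (an assumed setup, not the paper's experiment) builds a redundant feature space by appending linear combinations of the original features and compares the random subspace method against a single linear classifier on it.

```python
# A minimal sketch (illustrative setup, not the paper's experiment): build
# a redundant feature space, then compare the random subspace method with
# the single base classifier on it.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=100, n_features=10, random_state=0)
X_red = np.hstack([X, X @ rng.normal(size=(10, 30))])  # 30 redundant features

base = LinearDiscriminantAnalysis()
rsm = BaggingClassifier(LinearDiscriminantAnalysis(), n_estimators=50,
                        bootstrap=False, max_features=0.25, random_state=0)
for name, clf in [("single LDA", base), ("random subspace", rsm)]:
    print(name, cross_val_score(clf, X_red, y, cv=5).mean())
```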


International Conference on Multiple Classifier Systems | 2005

Combining feature subsets in feature selection

Marina Skurichina; Robert P. W. Duin

In feature selection, a part of the features is chosen as a new feature subset, while the rest of the features are ignored. However, the neglected features may still contain information useful for discriminating the data classes. To make use of this information, the combined-classifier approach can be used. In this paper we study the efficiency of combining classifiers applied on top of feature selection/extraction, and we analyze the conditions under which combining classifiers built on multiple feature subsets is more beneficial than exploiting a single selected feature set.
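
A minimal sketch of the combined-classifier approach (an assumed illustration, not the paper's code): the features are partitioned into disjoint subsets, one linear classifier is trained per subset, and their posterior probabilities are averaged, so that no features are discarded.

```python
# A minimal sketch (assumed illustration) of combining classifiers trained
# on disjoint feature subsets instead of discarding the unselected features.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=200, n_features=30, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Partition the features into three disjoint subsets.
subsets = np.array_split(np.arange(X.shape[1]), 3)

# Train one linear classifier per subset and average their posteriors
# (the mean combining rule).
probas = [LinearDiscriminantAnalysis()
          .fit(X_tr[:, s], y_tr)
          .predict_proba(X_te[:, s]) for s in subsets]
combined = np.mean(probas, axis=0).argmax(axis=1)
print("accuracy:", np.mean(combined == y_te))
```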


Multiple Classifier Systems | 2002

Bagging and Boosting for the Nearest Mean Classifier: Effects of Sample Size on Diversity and Accuracy

Marina Skurichina; Ludmila I. Kuncheva; Robert P. W. Duin

In combining classifiers, it is believed that diverse ensembles perform better than non-diverse ones. To test this hypothesis, we study the accuracy and diversity of ensembles obtained by bagging and boosting applied to the nearest mean classifier. In our simulation study we consider two diversity measures: the Q statistic and the disagreement measure. The experiments, carried out on four data sets, have shown that both the diversity and the accuracy of the ensembles depend on the training sample size. With the exception of very small training sample sizes, both bagging and boosting are more useful when ensembles consist of diverse classifiers. However, in boosting the relationship between diversity and the efficiency of ensembles is much stronger than in bagging.
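
For reference, the Q statistic mentioned above can be computed from the "oracle" outputs of a pair of classifiers as in the sketch below (an illustration, not the paper's code).

```python
# A minimal sketch (assumed, not from the paper) of the pairwise Q statistic
# computed from "oracle" outputs (1 = correct, 0 = wrong) of two classifiers.
import numpy as np

def q_statistic(oracle_a, oracle_b):
    """Q = (N11*N00 - N01*N10) / (N11*N00 + N01*N10), where Nxy counts
    objects on which classifier A is x-correct and B is y-correct.
    Q is near 0 for independent classifiers and positive when the two
    classifiers tend to err on the same objects."""
    a = np.asarray(oracle_a, dtype=bool)
    b = np.asarray(oracle_b, dtype=bool)
    n11 = np.sum(a & b); n00 = np.sum(~a & ~b)
    n10 = np.sum(a & ~b); n01 = np.sum(~a & b)
    return (n11 * n00 - n01 * n10) / (n11 * n00 + n01 * n10)

a = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
b = [1, 0, 0, 1, 1, 1, 0, 0, 1, 1]
print(q_statistic(a, b))  # 0.666...
```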


Multiple Classifier Systems | 2000

Boosting in Linear Discriminant Analysis

Marina Skurichina; Robert P. W. Duin

In recent years, together with bagging [5] and the random subspace method [15], boosting [6] has become one of the most popular combining techniques for improving weak classifiers. Usually, boosting is applied to Decision Trees (DTs). In this paper, we study boosting in Linear Discriminant Analysis (LDA). Simulation studies, carried out for one artificial data set and two real data sets, show that boosting might be useful in LDA for large training sample sizes, while bagging is useful for critical training sample sizes [11]. In contrast to a common opinion, we demonstrate that the usefulness of boosting does not depend on the instability of the classifier.
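
Since scikit-learn's LDA does not accept sample weights, a common workaround, shown in the assumed sketch below, is to apply boosting to LDA by resampling the training set according to the boosting weights; whether this matches the authors' exact setup is not stated here.

```python
# A minimal AdaBoost-style sketch (an assumption about the setup, not the
# authors' code) that boosts LDA by weighted resampling of the training set.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def boost_lda(X, y, n_rounds=10, seed=0):
    rng = np.random.default_rng(seed)
    n = len(X)
    w = np.full(n, 1.0 / n)                 # object weights
    classifiers, alphas = [], []
    for _ in range(n_rounds):
        idx = rng.choice(n, size=n, p=w)    # weighted resampling
        clf = LinearDiscriminantAnalysis().fit(X[idx], y[idx])
        miss = clf.predict(X) != y
        err = np.sum(w * miss)
        if err <= 0 or err >= 0.5:          # degenerate round: stop early
            break
        alpha = 0.5 * np.log((1 - err) / err)
        # Increase weights of misclassified objects and renormalize.
        w = w * np.exp(np.where(miss, alpha, -alpha))
        w = w / w.sum()
        classifiers.append(clf)
        alphas.append(alpha)
    return classifiers, alphas

def boost_predict(classifiers, alphas, X):
    # Weighted vote over {-1, +1}-coded predictions (two-class case).
    votes = sum(a * (2 * c.predict(X) - 1) for c, a in zip(classifiers, alphas))
    return (votes > 0).astype(int)

X, y = make_classification(n_samples=200, n_features=10, random_state=0)
clfs, alphas = boost_lda(X[:150], y[:150])
print("accuracy:", np.mean(boost_predict(clfs, alphas, X[150:]) == y[150:]))
```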


Lecture Notes in Computer Science | 2000

The Role of Combining Rules in Bagging and Boosting

Marina Skurichina; Robert P. W. Duin

To improve weak classifiers, bagging and boosting can be used. These techniques are based on combining classifiers. Usually, a simple majority vote or a weighted majority vote is used as the combining rule in bagging and boosting. However, other combining rules, such as the mean, product, and average rules, are possible. In this paper, we study bagging and boosting in Linear Discriminant Analysis (LDA) and the role of combining rules in bagging and boosting. Simulation studies, carried out for two artificial data sets and one real data set, show that bagging and boosting might be useful in LDA: bagging for critical training sample sizes and boosting for large training sample sizes. In contrast to a common opinion, we demonstrate that the usefulness of boosting does not directly depend on the instability of the classifier. It is also shown that the choice of the combining rule may affect the performance of bagging and boosting.
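
Several of the rules mentioned above can be written down in a few lines. The sketch below (an assumed illustration, not the paper's code) applies the mean, product, and simple majority vote rules to the class posteriors of an ensemble.

```python
# A minimal sketch (assumed illustration) of three combining rules applied
# to the class posteriors of an ensemble: mean, product, and majority vote.
import numpy as np

def combine(posteriors, rule):
    """posteriors: array of shape (n_classifiers, n_objects, n_classes)."""
    p = np.asarray(posteriors)
    if rule == "mean":
        return p.mean(axis=0).argmax(axis=1)
    if rule == "product":
        return p.prod(axis=0).argmax(axis=1)
    if rule == "majority":
        votes = p.argmax(axis=2)            # each classifier's crisp label
        return np.apply_along_axis(
            lambda v: np.bincount(v, minlength=p.shape[2]).argmax(), 0, votes)
    raise ValueError(rule)

# Three classifiers, two objects, two classes (illustrative posteriors).
post = [[[0.6, 0.4], [0.2, 0.8]],
        [[0.7, 0.3], [0.4, 0.6]],
        [[0.4, 0.6], [0.3, 0.7]]]
for rule in ("mean", "product", "majority"):
    print(rule, combine(post, rule))
```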

Collaboration


Dive into Marina Skurichina's collaborations.

Top Co-Authors

Robert P. W. Duin
Delft University of Technology

Jan Roodenburg
University Medical Center Groningen

Max J. H. Witjes
University Medical Center Groningen

Diana C.G. de Veld
Erasmus University Rotterdam

Arjen Amelink
Erasmus University Rotterdam

Henk C. Hoogsteden
Erasmus University Rotterdam

Joachim Aerts
Erasmus University Rotterdam

Martin P. L. Bard
Erasmus University Rotterdam