Publication


Featured research published by Tapio Elomaa.


Machine Learning | 1999

General and Efficient Multisplitting of Numerical Attributes

Tapio Elomaa; Juho Rousu

Often in supervised learning numerical attributes require special treatment and do not fit the learning scheme as well as one could hope. Nevertheless, they are common in practical tasks and, therefore, need to be taken into account. We characterize the well-behavedness of an evaluation function, a property that guarantees that the optimal multi-partition of an arbitrary numerical domain is defined on boundary points. Well-behavedness reduces the number of candidate cut points that need to be examined in multisplitting numerical attributes. Many commonly used attribute evaluation functions possess this property; we demonstrate that the cumulative functions Information Gain and Training Set Error as well as the non-cumulative functions Gain Ratio and Normalized Distance Measure are all well-behaved. We also devise a method of finding optimal multisplits efficiently by examining the minimum number of boundary point combinations that is required to produce partitions which are optimal with respect to a cumulative and well-behaved evaluation function. Our empirical experiments validate the utility of optimal multisplitting: it consistently produces better partitions than alternative approaches while requiring only comparable time. In top-down induction of decision trees the choice of evaluation function has a more decisive effect on the result than the choice of partitioning strategy; optimizing the value of most common attribute evaluation functions does not raise the accuracy of the produced decision trees. In our tests the construction time using optimal multisplitting was, on average, twice that required by greedy multisplitting, which in turn required on average twice the time of binary splitting.
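The boundary-point idea behind the candidate-reduction result can be sketched as follows (a minimal illustration, not the paper's algorithm; the function name and the midpoint convention for cut points are my own):

```python
from collections import Counter
from itertools import groupby

def boundary_points(values, labels):
    """Candidate cut points for multisplitting one numeric attribute.

    Sort examples on the attribute, group equal values into blocks, and
    keep a cut point between adjacent blocks only when their relative
    class distributions differ. A sketch of the boundary-point idea,
    not the exact algorithm of the paper.
    """
    pairs = sorted(zip(values, labels))
    blocks = []
    for v, grp in groupby(pairs, key=lambda p: p[0]):
        counts = Counter(label for _, label in grp)
        total = sum(counts.values())
        dist = {c: n / total for c, n in counts.items()}
        blocks.append((v, dist))
    cuts = []
    for (v1, d1), (v2, d2) in zip(blocks, blocks[1:]):
        if d1 != d2:                     # class distribution changes here
            cuts.append((v1 + v2) / 2)   # midpoint is a boundary point
    return cuts
```

With four examples of classes a, a, b, b at values 1..4, only the single cut between the classes survives, instead of all three value gaps.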


Journal of Artificial Intelligence Research | 2001

An Analysis of Reduced Error Pruning

Tapio Elomaa; Matti Kääriäinen

Top-down induction of decision trees has been observed to suffer from inadequate functioning of the pruning phase. In particular, it is known that the size of the resulting tree grows linearly with the sample size, even though the accuracy of the tree does not improve. Reduced Error Pruning is an algorithm that has been used as a representative technique in attempts to explain the problems of decision tree learning.

In this paper we present analyses of Reduced Error Pruning in three different settings. First we study the basic algorithmic properties of the method, properties that hold independent of the input decision tree and pruning examples. Then we examine a situation that intuitively should lead to the subtree under consideration being replaced by a leaf node: one in which the class label and attribute values of the pruning examples are independent of each other. This analysis is conducted under two different assumptions. The general analysis shows that the probability of pruning a node that fits pure noise is bounded by a function that decreases exponentially as the size of the tree grows. In a specific analysis we assume that the examples are distributed uniformly to the tree. This assumption lets us approximate the number of subtrees that are pruned because they do not receive any pruning examples.

This paper clarifies the different variants of the Reduced Error Pruning algorithm, brings new insight into its algorithmic properties, analyses the algorithm with fewer imposed assumptions than before, and includes the previously overlooked empty subtrees in the analysis.
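The basic bottom-up Reduced Error Pruning procedure analysed in the paper can be sketched as follows (a minimal illustration assuming a simple node representation; the `Node` class and helper names are mine, not the paper's):

```python
class Node:
    """A decision tree node; a node with no children acts as a leaf."""
    def __init__(self, label=None, split=None, children=None):
        self.label = label          # majority class at this node
        self.split = split          # function: example -> child index
        self.children = children or []

def errors(node, examples):
    """Number of pruning examples misclassified by the subtree."""
    if not node.children:
        return sum(1 for x, y in examples if y != node.label)
    total = 0
    for i, child in enumerate(node.children):
        total += errors(child, [(x, y) for x, y in examples
                                if node.split(x) == i])
    return total

def rep(node, examples):
    """Bottom-up Reduced Error Pruning: replace a subtree by a leaf
    whenever the leaf is at least as accurate on the pruning set."""
    if not node.children:
        return node
    for i, child in enumerate(node.children):
        node.children[i] = rep(child, [(x, y) for x, y in examples
                                       if node.split(x) == i])
    leaf_err = sum(1 for x, y in examples if y != node.label)
    if leaf_err <= errors(node, examples):
        node.children, node.split = [], None   # prune to a leaf
    return node
```

In the noise-fitting setting the paper studies, a subtree whose pruning examples all carry one class is collapsed to a single leaf by this rule.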


Data Mining and Knowledge Discovery | 2004

Efficient Multisplitting Revisited: Optima-Preserving Elimination of Partition Candidates

Tapio Elomaa; Juho Rousu

We consider multisplitting of numerical value ranges, a task that is encountered as a discretization step preceding induction and also embedded into learning algorithms. We are interested in finding the partition that optimizes the value of a given attribute evaluation function. For most commonly used evaluation functions this task takes quadratic time in the number of potential cut points in the numerical range. Hence, it is a potential bottleneck in data mining algorithms.

We present two techniques that speed up the optimal multisplitting task. The first aims at discarding cut point candidates in a quick linear-time preprocessing scan before embarking on the actual search. We generalize the definition of boundary points by Fayyad and Irani to allow merging adjacent example blocks that have the same relative class distribution. We prove for several commonly used evaluation functions that this preprocessing removes only suboptimal cut points; hence, the algorithm does not lose optimality.

Our second technique tackles the quadratic-time dynamic programming algorithm, which is the best schema for optimizing many well-known evaluation functions. We present a technique that dynamically, i.e., during the search, prunes partitions of prefixes of the sorted data from the search space of the algorithm. The method works for all convex and cumulative evaluation functions.

Together these two techniques speed up the multisplitting process considerably. Compared to the baseline dynamic programming algorithm, the speed-up is around 50 percent on average and up to 90 percent in some cases. We conclude that optimal multisplitting is fully feasible on all benchmark data sets we have encountered.
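The baseline quadratic-time dynamic programming schema can be sketched for the Training Set Error function (a simplified illustration without the paper's candidate-elimination and pruning techniques; the function names are mine):

```python
def optimal_multisplit(labels, k):
    """Optimal k-way split of a sorted value range by dynamic programming.

    labels: class labels of the examples, sorted by the numeric attribute.
    Minimizes Training Set Error, one of the cumulative evaluation
    functions discussed in the paper. Returns (error, cut positions).
    Quadratic in the number of positions: the baseline that the paper's
    pruning techniques speed up.
    """
    n = len(labels)

    def seg_err(i, j):
        # Error of labelling examples i..j-1 with their majority class.
        counts = {}
        for label in labels[i:j]:
            counts[label] = counts.get(label, 0) + 1
        return (j - i) - max(counts.values())

    INF = float('inf')
    # best[m][i]: min error splitting the first i examples into m segments
    best = [[INF] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0
    for m in range(1, k + 1):
        for i in range(1, n + 1):
            for j in range(m - 1, i):
                cand = best[m - 1][j] + seg_err(j, i)
                if cand < best[m][i]:
                    best[m][i], back[m][i] = cand, j
    # Recover cut positions (indices into the sorted example list).
    cuts, i = [], n
    for m in range(k, 0, -1):
        i = back[m][i]
        if m > 1:
            cuts.append(i)
    return best[k][n], sorted(cuts)
```

On six sorted examples of classes a, a, b, b, c, c, a three-way split recovers the two class borders with zero error, while the best two-way split must misclassify one class.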


Pattern Recognition | 2003

Flexible view recognition for indoor navigation based on Gabor filters and support vector machines

Ilkka Autio; Tapio Elomaa

Real-world scenes are hard to segment into (relevant) objects and (irrelevant) background. In this paper we argue for view-based vision, which does not use segmentation, and demonstrate a practical approach for recognizing textured objects and scenes in office environments. A small set of Gabor filters is used to preprocess texture combinations from input images. The impulse responses of the filters are transformed into feature vectors that are fed to support vector machines. Pairwise feature comparisons decide the classification of views. We validate the approach on a robot platform using three different types of target objects and indoor scenes: people, doorways, and written signs. The general-purpose system runs in real time, and recognition accuracies of practical utility are obtained.


Intelligent Data Analysis | 1999

The Biases of Decision Tree Pruning Strategies

Tapio Elomaa

Post-pruning of decision trees has been a successful approach in many real-world experiments, but over all possible concepts it does not bring any inherent improvement to an algorithm's performance. This work explores how a PAC-proven decision tree learning algorithm fares in comparison with two variants of the normal top-down induction of decision trees. The algorithm does not prune its hypothesis per se, but it can be understood to perform pre-pruning of the evolving tree. We study a backtracking search algorithm, called Rank, for learning rank-minimal decision trees. Our experiments follow closely those performed by Schaffer [20]. They confirm his main findings: pruning works in learning concepts with a simple description, while for concepts with a complex description, and when all concepts are equally likely, pruning is injurious rather than beneficial to the average performance of greedy top-down induction of decision trees. Pre-pruning, as a gentler technique, settles in average performance in the middle ground between not pruning at all and post-pruning.


European Conference on Machine Learning | 1994

A Geometric Approach to Feature Selection

Tapio Elomaa; Esko Ukkonen

We propose a new method for selecting features, or deciding on splitting points, in inductive learning. Its main innovation is to take the positions of examples into account instead of just considering the numbers of examples from different classes that fall on different sides of a splitting rule. The method gives rise to a family of feature selection techniques. We demonstrate the promise of the developed method with initial empirical experiments in connection with top-down induction of decision trees.


Knowledge and Information Systems | 2003

Necessary and Sufficient Pre-processing in Numerical Range Discretization

Tapio Elomaa; Juho Rousu

The time complexities of class-driven numerical range discretization algorithms depend on the number of cut point candidates. Previous analysis has shown that only a subset of all cut points - the segment borders - have to be taken into account in optimal discretization with respect to many goodness criteria. In this paper we show that inspecting segment borders alone suffices in optimizing any convex evaluation function. For strictly convex evaluation functions inspecting all of them is also necessary, since the placement of neighboring cut points affects the optimality of a segment border. With the training set error function, which is not strictly convex, it suffices to inspect an even smaller set of cut point candidates, called alternations, when striving for an optimal partition. On the other hand, we prove that failing to check an alternation may lead to suboptimal discretization. We present a linear-time algorithm for finding all alternation points. The number of alternation points is typically much lower than the total number of cut points. In our experiments, running the discretization algorithm over the sequence of alternation points led to a significant speed-up.


Discovery Science | 2006

A Voronoi Diagram Approach to Autonomous Clustering

Heidi Koivistoinen; Minna Ruuska; Tapio Elomaa

Clustering is a basic tool in unsupervised machine learning and data mining. Distance-based clustering algorithms rarely have the means to autonomously come up with the correct number of clusters from the data. A recent approach to identifying the natural clusters is to compare the point densities in different parts of the sample space.

In this paper we put forward an agglomerative clustering algorithm which accesses density information by constructing a Voronoi diagram for the input sample. The volumes of the point cells directly reflect the point density in the respective parts of the instance space. Scanning through the input points and their Voronoi cells once, we combine the densest parts of the instance space into clusters.

Our empirical experiments demonstrate that the proposed algorithm is able to come up with a high-accuracy clustering for many different types of data. The Voronoi approach clearly outperforms the k-means algorithm on data conforming to its underlying assumptions.


Algorithms and Applications | 2010

Covering analysis of the greedy algorithm for partial cover

Tapio Elomaa; Jussi Kujala

The greedy algorithm is known to have a guaranteed approximation performance in many variations of the well-known minimum set cover problem. We analyze the number of elements covered by the greedy algorithm for the minimum set cover problem when executed for k rounds. For the p-partial cover problem over a ground set of m elements, this analysis quite easily yields the harmonic approximation guarantee H(⌈pm⌉) for the number of required covering sets. Thus, we tie together the coverage analysis of the greedy algorithm for minimum set cover and its dual problem, partial cover.
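The greedy algorithm under analysis can be sketched as follows (a plain illustration of greedy p-partial cover; the function and variable names are my own):

```python
import math

def greedy_partial_cover(universe, sets, p):
    """Greedy p-partial cover: repeatedly pick the set covering the most
    yet-uncovered elements until at least a fraction p of the universe
    is covered. Returns the indices of the chosen sets.

    The analysis discussed above bounds the number of sets used by a
    factor of H(ceil(p * m)) relative to the optimum, where m is the
    size of the universe and H is the harmonic number.
    """
    need = math.ceil(p * len(universe))
    uncovered = set(universe)
    chosen, covered = [], 0
    while covered < need:
        # Pick the set covering the most still-uncovered elements.
        i = max(range(len(sets)), key=lambda j: len(sets[j] & uncovered))
        gain = len(sets[i] & uncovered)
        if gain == 0:
            break                     # no set can make further progress
        uncovered -= sets[i]
        covered += gain
        chosen.append(i)
    return chosen
```

For p = 1 this reduces to the usual greedy algorithm for minimum set cover; smaller p stops the rounds early, which is exactly the k-round view taken in the analysis.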


International Conference on Data Mining | 2009

A Walk from 2-Norm SVM to 1-Norm SVM

Jussi Kujala; Timo Aho; Tapio Elomaa

This paper studies how useful the standard 2-norm regularized SVM is in approximating the 1-norm SVM problem. To this end, we examine a general method that is based on iteratively re-weighting the features and solving a 2-norm optimization problem. The convergence rate of this method is unknown. Previous work indicates that it might require an excessive number of iterations. We study how well we can do with just a small number of iterations. In theory the convergence rate is fast, except for coordinates of the current solution that are close to zero. Our empirical experiments confirm this. In many problems with irrelevant features, already one iteration is often enough to produce accuracy as good as or better than that of the 1-norm SVM. Hence, it seems that in these problems we do not need to converge to the 1-norm SVM solution near zero values. The benefit of this approach is that we can build something similar to the 1-norm regularized solver based on any 2-norm regularized solver. This is quick to implement and the solution inherits the good qualities of the solver such as scalability and stability.
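The re-weighting idea can be illustrated on a least-squares objective instead of the SVM hinge loss (an assumption made to keep the sketch self-contained and solver-free; the function and parameter names are mine): scaling each feature by the square root of the previous round's weight magnitude makes the 2-norm penalty on the rescaled problem approximate the 1-norm penalty on the original one.

```python
import numpy as np

def reweighted_ridge(X, y, lam=1.0, iters=3, eps=1e-8):
    """Approximate an L1-regularized fit by repeated L2 (ridge) solves.

    A sketch of the iterative re-weighting scheme on squared loss, not
    the paper's SVM formulation: with w_j = s_j * v_j and
    s_j = sqrt(|w_j_prev|), the ridge penalty lam * sum(v_j**2) equals
    lam * sum(w_j**2 / |w_j_prev|), a local approximation of lam * ||w||_1.
    """
    n, d = X.shape
    w = np.ones(d)
    for _ in range(iters):
        s = np.sqrt(np.abs(w) + eps)      # per-feature scaling
        Xs = X * s                        # re-weighted design matrix
        A = Xs.T @ Xs + lam * np.eye(d)
        v = np.linalg.solve(A, Xs.T @ y)  # ridge solution in scaled space
        w = s * v                         # map back to original features
    return w
```

Consistent with the paper's observation, an irrelevant feature's weight is driven close to zero after very few iterations, while relevant weights are barely shrunk.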

Collaboration


Top co-authors of Tapio Elomaa:

Juho Rousu (University of Helsinki)
Jussi Kujala (Tampere University of Technology)
Matti Saarela (Tampere University of Technology)
Petri Lehtinen (Tampere University of Technology)
Timo Aho (Tampere University of Technology)
Teemu J. Heinimäki (Tampere University of Technology)