Tapio Elomaa
Tampere University of Technology
Publications
Featured research published by Tapio Elomaa.
Machine Learning | 1999
Tapio Elomaa; Juho Rousu
Often in supervised learning numerical attributes require special treatment and do not fit the learning scheme as well as one could hope. Nevertheless, they are common in practical tasks and, therefore, need to be taken into account. We characterize the well-behavedness of an evaluation function, a property that guarantees the optimal multi-partition of an arbitrary numerical domain to be defined on boundary points. Well-behavedness reduces the number of candidate cut points that need to be examined in multisplitting numerical attributes. Many commonly used attribute evaluation functions possess this property; we demonstrate that the cumulative functions Information Gain and Training Set Error as well as the non-cumulative functions Gain Ratio and Normalized Distance Measure are all well-behaved. We also devise a method of finding optimal multisplits efficiently by examining the minimum number of boundary point combinations required to produce partitions that are optimal with respect to a cumulative and well-behaved evaluation function. Our empirical experiments validate the utility of optimal multisplitting: it consistently produces better partitions than alternative approaches and requires only comparable time. In top-down induction of decision trees the choice of evaluation function has a more decisive effect on the result than the choice of partitioning strategy; optimizing the value of the most common attribute evaluation functions does not raise the accuracy of the produced decision trees. In our tests the construction time using optimal multisplitting was, on average, twice that required by greedy multisplitting, which in turn required on average twice the time of binary splitting.
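The boundary-point result has a direct algorithmic reading: when searching for cut points of a numerical attribute, only thresholds between adjacent value groups whose class distributions differ need to be considered. The sketch below illustrates this for a single attribute; it is our own illustration, not the authors' implementation, and the data layout and function name are assumptions.

```python
from collections import Counter

def boundary_points(values, labels):
    """Candidate cut points of a numerical attribute, restricted to
    boundary points: thresholds between adjacent value groups whose
    class distributions differ. Illustrative sketch only."""
    # Sort examples by attribute value and group equal values together.
    pairs = sorted(zip(values, labels))
    groups = []                       # (value, class counts of that value)
    for v, y in pairs:
        if groups and groups[-1][0] == v:
            groups[-1][1][y] += 1
        else:
            groups.append((v, Counter({y: 1})))
    cuts = []
    for (v1, d1), (v2, d2) in zip(groups, groups[1:]):
        # A threshold between two value groups is a boundary point
        # unless both groups are pure in the same class.
        same_pure = len(d1) == 1 == len(d2) and set(d1) == set(d2)
        if not same_pure:
            cuts.append((v1 + v2) / 2.0)
    return cuts

print(boundary_points([1, 2, 3, 4, 5, 6], list("aabbba")))   # -> [2.5, 5.5]
```

Only these cut points need to be enumerated when searching for an optimal multisplit under a well-behaved evaluation function.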
Journal of Artificial Intelligence Research | 2001
Tapio Elomaa; Matti Kääriäinen
Top-down induction of decision trees has been observed to suffer from the inadequate functioning of the pruning phase. In particular, it is known that the size of the resulting tree grows linearly with the sample size, even though the accuracy of the tree does not improve. Reduced Error Pruning is an algorithm that has been used as a representative technique in attempts to explain the problems of decision tree learning.

In this paper we present analyses of Reduced Error Pruning in three different settings. First we study the basic algorithmic properties of the method, properties that hold independent of the input decision tree and pruning examples. Then we examine a situation that intuitively should lead to the subtree under consideration being replaced by a leaf node: one in which the class label and attribute values of the pruning examples are independent of each other. This analysis is conducted under two different assumptions. The general analysis shows that the probability of pruning a node that fits pure noise is bounded by a function that decreases exponentially as the size of the tree grows. In a specific analysis we assume that the examples are distributed uniformly over the tree. This assumption lets us approximate the number of subtrees that are pruned because they do not receive any pruning examples.

This paper clarifies the different variants of the Reduced Error Pruning algorithm, brings new insight into its algorithmic properties, analyses the algorithm under fewer imposed assumptions than before, and includes the previously overlooked empty subtrees in the analysis.
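As a concrete reading of the pruning rule the paper analyses, the sketch below applies bottom-up Reduced Error Pruning to a toy tree. The Node layout and helper names are our own assumptions, and variant-specific details, notably the handling of nodes that receive no pruning examples (which the paper discusses at length), are glossed over.

```python
class Node:
    """Toy decision-tree node: leaves have no children; internal nodes
    split on attribute index `split`. `label` is the majority class."""
    def __init__(self, label, split=None, children=None):
        self.label, self.split, self.children = label, split, children or {}

def classify(node, x):
    if not node.children:
        return node.label
    child = node.children.get(x[node.split])
    return node.label if child is None else classify(child, x)

def reduced_error_prune(node, prune_set):
    """Bottom-up REP: replace a subtree with a leaf whenever the leaf
    makes no more errors on the pruning examples than the subtree."""
    if not node.children:
        return node
    for value, child in list(node.children.items()):
        subset = [(x, y) for x, y in prune_set if x[node.split] == value]
        node.children[value] = reduced_error_prune(child, subset)
    subtree_err = sum(y != classify(node, x) for x, y in prune_set)
    leaf_err = sum(y != node.label for _, y in prune_set)
    if leaf_err <= subtree_err:
        node.children = {}            # prune: the node becomes a leaf
    return node

# Toy example: a split that fits noise in the pruning set gets removed.
tree = Node("a", split=0, children={0: Node("a"), 1: Node("b")})
pruned = reduced_error_prune(tree, [((0,), "a"), ((1,), "a")])
print(bool(pruned.children))          # -> False: subtree replaced by a leaf
```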
Data Mining and Knowledge Discovery | 2004
Tapio Elomaa; Juho Rousu
We consider multisplitting of numerical value ranges, a task that is encountered as a discretization step preceding induction and also embedded into learning algorithms. We are interested in finding the partition that optimizes the value of a given attribute evaluation function. For most commonly used evaluation functions this task takes quadratic time in the number of potential cut points in the numerical range. Hence, it is a potential bottleneck in data mining algorithms.

We present two techniques that speed up the optimal multisplitting task. The first one aims at discarding cut point candidates in a quick linear-time preprocessing scan before embarking on the actual search. We generalize the definition of boundary points by Fayyad and Irani to allow us to merge adjacent example blocks that have the same relative class distribution. We prove for several commonly used evaluation functions that this processing removes only suboptimal cut points. Hence, the algorithm does not lose optimality.

Our second technique tackles the quadratic-time dynamic programming algorithm, which is the best schema for optimizing many well-known evaluation functions. We present a technique that dynamically, i.e., during the search, prunes partitions of prefixes of the sorted data from the search space of the algorithm. The method works for all convex and cumulative evaluation functions.

Together the use of these two techniques speeds up the multisplitting process considerably. Compared to the baseline dynamic programming algorithm the speed-up is around 50 percent on average and up to 90 percent in some cases. We conclude that optimal multisplitting is fully feasible on all benchmark data sets we have encountered.
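For concreteness, here is the kind of quadratic dynamic program over the sorted examples that the two techniques above accelerate, instantiated for Training Set Error (a cumulative evaluation function). This is a baseline sketch of our own, without either speed-up; recomputing interval errors keeps it short, whereas a real implementation would obtain them in constant time from prefix class counts.

```python
from collections import Counter

def optimal_multisplit(labels, k):
    """Minimal-error split of a sorted numerical range into k intervals,
    found by dynamic programming. `labels` are the class labels of the
    examples in attribute-value order. Illustrative sketch only."""
    n = len(labels)
    INF = float("inf")

    def interval_error(i, j):         # majority-vote error on labels[i:j]
        counts = Counter(labels[i:j])
        return (j - i) - max(counts.values())

    # best[t][j]: minimal error splitting the first j examples into t intervals.
    best = [[INF] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0
    for t in range(1, k + 1):
        for j in range(t, n + 1):
            best[t][j] = min(best[t - 1][i] + interval_error(i, j)
                             for i in range(t - 1, j))
    return best[k][n]

print(optimal_multisplit(list("aaabbbaab"), 3))   # -> 1  (aaa | bbb | aab)
```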
Pattern Recognition | 2003
Ilkka Autio; Tapio Elomaa
Real-world scenes are hard to segment into (relevant) objects and (irrelevant) background. In this paper, we argue for view-based vision, which does not use segmentation, and demonstrate a practical approach for recognizing textured objects and scenes in office environments. A small set of Gabor filters is used to preprocess texture combinations from input images. The impulse responses of the filters are transformed into feature vectors that are fed to support vector machines. Pairwise feature comparisons decide the classification of views. We validate the approach on a robot platform using three different types of target objects and indoor scenes: people, doorways, and written signs. The general-purpose system can run in real time, and recognition accuracies of practical utility are obtained.
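A minimal sketch of such a pipeline can be put together from off-the-shelf components: a small Gabor filter bank pooled into a fixed-length feature vector and fed to a support vector machine. The filter parameters, the mean-magnitude pooling, and the single multi-class SVC below are our own simplifications standing in for the paper's pairwise comparison scheme.

```python
import numpy as np
from skimage.filters import gabor
from sklearn.svm import SVC

def gabor_features(image,
                   frequencies=(0.1, 0.2),
                   angles=(0, np.pi / 4, np.pi / 2, 3 * np.pi / 4)):
    """Mean response magnitudes of a small Gabor filter bank: a crude
    stand-in for the texture features described above."""
    feats = []
    for f in frequencies:
        for theta in angles:
            real, imag = gabor(image, frequency=f, theta=theta)
            feats.append(np.hypot(real, imag).mean())
    return np.array(feats)

# Hypothetical usage on labelled grayscale training views (loader not shown):
# images, view_labels = load_views()
# X = np.stack([gabor_features(img) for img in images])
# clf = SVC(kernel="rbf").fit(X, view_labels)
# print(clf.predict(gabor_features(new_view).reshape(1, -1)))
```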
Intelligent Data Analysis | 1999
Tapio Elomaa
Post-pruning of decision trees has been a successful approach in many real-world experiments, but over all possible concepts it does not bring any inherent improvement to an algorithm's performance. This work explores how a PAC-proven decision tree learning algorithm fares in comparison with two variants of the normal top-down induction of decision trees. The algorithm does not prune its hypothesis per se, but it can be understood to do pre-pruning of the evolving tree. We study a backtracking search algorithm, called Rank, for learning rank-minimal decision trees. Our experiments follow closely those performed by Schaffer [20]. They confirm the main findings of Schaffer: pruning works when learning concepts with a simple description; for concepts with a complex description, and when all concepts are equally likely, pruning is injurious rather than beneficial to the average performance of greedy top-down induction of decision trees. Pre-pruning, as a gentler technique, settles in average performance in the middle ground between not pruning at all and post-pruning.
European Conference on Machine Learning | 1994
Tapio Elomaa; Esko Ukkonen
We propose a new method for selecting features, or deciding on splitting points, in inductive learning. Its main innovation is to take the positions of examples into account instead of just considering the numbers of examples from different classes that fall on different sides of a splitting rule. The method gives rise to a family of feature selection techniques. We demonstrate the promise of the developed method with initial empirical experiments in connection with top-down induction of decision trees.
Knowledge and Information Systems | 2003
Tapio Elomaa; Juho Rousu
The time complexities of class-driven numerical range discretization algorithms depend on the number of cut point candidates. Previous analysis has shown that only a subset of all cut points - the segment borders - have to be taken into account in optimal discretization with respect to many goodness criteria. In this paper we show that inspecting segment borders alone suffices in optimizing any convex evaluation function. For strictly convex evaluation functions inspecting all of them is also necessary, since the placement of neighboring cut points affects the optimality of a segment border. With the training set error function, which is not strictly convex, it suffices to inspect an even smaller set of cut point candidates, called alternations, when striving for an optimal partition. On the other hand, we prove that failing to check an alternation may lead to suboptimal discretization. We present a linear-time algorithm for finding all alternation points. The number of alternation points is typically much lower than the total number of cut points. In our experiments, running the discretization algorithm over the sequence of alternation points led to a significant speed-up.
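The segment borders mentioned above are straightforward to extract in linear time once the examples are sorted: adjacent value blocks with identical relative class distributions merge into one segment, and only the borders between segments survive as candidate cut points. A sketch under our own data layout (the alternation points, which require the training set error function specifically, are not covered here):

```python
from collections import Counter

def segment_borders(values, labels):
    """Merge adjacent value blocks whose relative class distributions
    coincide; the borders of the resulting segments are the only cut
    points to inspect for convex evaluation functions. Sketch only."""
    pairs = sorted(zip(values, labels))
    blocks = []                       # (value, class counts of that value)
    for v, y in pairs:
        if blocks and blocks[-1][0] == v:
            blocks[-1][1][y] += 1
        else:
            blocks.append((v, Counter({y: 1})))

    def relative(dist):               # class distribution normalized to sum 1
        total = sum(dist.values())
        return {c: n / total for c, n in dist.items()}

    borders = []
    for (v1, d1), (v2, d2) in zip(blocks, blocks[1:]):
        if relative(d1) != relative(d2):
            borders.append((v1 + v2) / 2.0)
    return borders

# The two 50/50 blocks at values 1 and 2 merge into one segment:
print(segment_borders([1, 1, 2, 2, 3], ["a", "b", "a", "b", "a"]))   # -> [2.5]
```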
Discovery Science | 2006
Heidi Koivistoinen; Minna Ruuska; Tapio Elomaa
Clustering is a basic tool in unsupervised machine learning and data mining. Distance-based clustering algorithms rarely have the means to autonomously come up with the correct number of clusters from the data. A recent approach to identifying the natural clusters is to compare the point densities in different parts of the sample space.

In this paper we put forward an agglomerative clustering algorithm which accesses density information by constructing a Voronoi diagram for the input sample. The volumes of the point cells directly reflect the point density in the respective parts of the instance space. Scanning through the input points and their Voronoi cells once, we combine the densest parts of the instance space into clusters.

Our empirical experiments demonstrate that the proposed algorithm is able to come up with a high-accuracy clustering for many different types of data. The Voronoi approach clearly outperforms the k-means algorithm on data conforming to its underlying assumptions.
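The density signal the algorithm reads off the diagram can be reproduced with standard tools: in two dimensions the area of a point's Voronoi cell is inversely related to the local density around that point. The sketch below computes these areas with SciPy; the agglomerative merging pass itself is omitted, so this shows only the density-estimation ingredient, not the paper's full algorithm.

```python
import numpy as np
from scipy.spatial import Voronoi

def cell_areas(points):
    """Area of each bounded 2-D Voronoi cell (small area = high local
    density); unbounded border cells get infinite area."""
    vor = Voronoi(points)
    areas = np.full(len(points), np.inf)
    for p, r in enumerate(vor.point_region):
        region = vor.regions[r]
        if -1 in region or not region:        # cell extends to infinity
            continue
        poly = vor.vertices[region]           # ordered polygon vertices
        x, y = poly[:, 0], poly[:, 1]
        # Shoelace formula for the polygon's area.
        areas[p] = 0.5 * abs(np.dot(x, np.roll(y, 1)) - np.dot(y, np.roll(x, 1)))
    return areas

pts = np.random.rand(30, 2)
print(cell_areas(pts))    # dense regions show up as small cell areas
```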
Algorithms and Applications | 2010
Tapio Elomaa; Jussi Kujala
The greedy algorithm is known to have a guaranteed approximation performance in many variations of the well-known minimum set cover problem. We analyze the number of elements covered by the greedy algorithm for the minimum set cover problem when it is executed for k rounds. For the p-partial cover problem over a ground set of m elements, this analysis quite easily yields the harmonic approximation guarantee H(⌈pm⌉) for the number of covering sets required. Thus, we tie together the coverage analysis of the greedy algorithm for minimum set cover and its dual problem, partial cover.
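As a reference point for the analysis, here is the greedy rule itself for the p-partial cover setting: repeatedly pick the set covering the most still-uncovered elements until at least ⌈pm⌉ elements are covered. The guarantee discussed above bounds the number of sets this rule chooses by a factor H(⌈pm⌉) of the optimum; the code is a plain sketch of the rule, with names of our own choosing.

```python
import math

def greedy_partial_cover(universe, sets, p):
    """Greedily pick sets until a fraction p of the universe is covered;
    returns the chosen sets. Illustrative sketch of the greedy rule."""
    target = math.ceil(p * len(universe))     # ceil(p*m) elements needed
    covered, chosen = set(), []
    while len(covered) < target:
        # Greedy choice: the set covering the most uncovered elements.
        best = max(sets, key=lambda s: len(s - covered))
        if not (best - covered):              # nothing left to gain
            break
        chosen.append(best)
        covered |= best
    return chosen

U = set(range(10))
S = [set(range(0, 6)), set(range(5, 9)), {9}, {0, 9}]
print(greedy_partial_cover(U, S, 0.8))   # two sets suffice to cover 8 of 10
```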
International Conference on Data Mining | 2009
Jussi Kujala; Timo Aho; Tapio Elomaa
This paper studies how useful the standard 2-norm regularized SVM is in approximating the 1-norm SVM problem. To this end, we examine a general method that is based on iteratively re-weighting the features and solving a 2-norm optimization problem. The convergence rate of this method is unknown, and previous work indicates that it might require an excessive number of iterations. We study how well we can do with just a small number of iterations. In theory the convergence rate is fast, except for coordinates of the current solution that are close to zero. Our empirical experiments confirm this. In many problems with irrelevant features, a single iteration is often enough to produce accuracy as good as or better than that of the 1-norm SVM. Hence, it seems that in these problems we do not need to converge to the 1-norm SVM solution near zero values. The benefit of this approach is that we can build something similar to a 1-norm regularized solver on top of any 2-norm regularized solver. This is quick to implement, and the solution inherits the good qualities of the underlying solver, such as scalability and stability.
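A generic version of such a re-weighting scheme can be assembled from any linear 2-norm SVM solver. In the sketch below (our own formulation; the paper's exact update rule may differ) features are rescaled by the square root of the current weight magnitudes, so that the 2-norm penalty in the rescaled coordinates approximates the 1-norm penalty in the original ones.

```python
import numpy as np
from sklearn.svm import LinearSVC

def reweighted_l2_svm(X, y, iterations=3, eps=1e-6):
    """Approximate a 1-norm SVM with a 2-norm solver by iteratively
    rescaling features by the current weight magnitudes. Sketch only."""
    scale = np.ones(X.shape[1])
    for _ in range(iterations):
        clf = LinearSVC(C=1.0).fit(X * scale, y)
        w = clf.coef_.ravel() * scale         # weights in original coordinates
        # With scale_i = sqrt(|w_i|), the 2-norm penalty sum (w_i/scale_i)^2
        # is roughly sum |w_i|, so small weights get shrunk further,
        # mimicking the sparsity pressure of the 1-norm penalty.
        scale = np.sqrt(np.abs(w)) + eps
    return w

# Hypothetical usage on a feature matrix X (n x d) and labels y:
# w = reweighted_l2_svm(X, y)
# near-zero entries of w suggest features a 1-norm SVM would drop.
```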