Publication


Featured research published by Wilhelmiina Hämäläinen.


Intelligent Tutoring Systems | 2006

Comparison of machine learning methods for intelligent tutoring systems

Wilhelmiina Hämäläinen; Mikko Vinni

To implement real intelligence or adaptivity, the models for intelligent tutoring systems should be learnt from data. However, educational data sets are so small that machine learning methods cannot be applied directly. In this paper, we tackle this problem and give general guidelines for creating accurate classifiers for educational data. We describe our experiment, in which we were able to predict course success with more than 80% accuracy in the middle of the course, given only a hundred rows of data.


International Conference on Data Mining | 2008

Efficient Discovery of Statistically Significant Association Rules

Wilhelmiina Hämäläinen; Matti Nykänen

Searching for statistically significant association rules is an important but neglected problem. Traditional association rules do not capture the idea of statistical dependence, so the resulting rules can be spurious, while the most significant rules may be missing. This leads to erroneous models and predictions, which often become expensive. The problem is computationally very difficult, because significance is not a monotonic property. However, in this paper we prove several other properties, which can be used for pruning the search space. The properties are implemented in the StatApriori algorithm, which searches for statistically significant, non-redundant association rules. Based on both theoretical and empirical observations, the resulting rules are very accurate compared to traditional association rules. In addition, StatApriori can work with extremely low frequencies, thus finding new interesting rules.
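
The point that a traditional rule can be spurious can be illustrated with a small calculation. The sketch below uses made-up counts (not taken from the paper) and the widely used confidence and lift measures. It does not reproduce StatApriori or its significance measures; it only shows how a rule can pass a confidence threshold while the antecedent and consequent are in fact negatively dependent.

```python
def confidence(n_xa: int, n_x: int) -> float:
    """P(A | X): fraction of rows containing X that also contain A."""
    return n_xa / n_x


def lift(n: int, n_x: int, n_a: int, n_xa: int) -> float:
    """P(X, A) / (P(X) * P(A)); a value of 1.0 means X and A are independent."""
    return (n_xa * n) / (n_x * n_a)


# Hypothetical counts: 1000 rows, X occurs in 500, A in 900, both in 400.
n, n_x, n_a, n_xa = 1000, 500, 900, 400

conf = confidence(n_xa, n_x)    # high enough to pass a typical confidence threshold
dep = lift(n, n_x, n_a, n_xa)   # below 1.0: X makes A less likely than its base rate
```

Here the rule X -> A looks strong by confidence alone, yet A is rarer among rows with X than in the data as a whole, which is the kind of rule a dependence-based measure would reject.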


Knowledge and Information Systems | 2010

StatApriori: an efficient algorithm for searching statistically significant association rules

Wilhelmiina Hämäläinen

Searching for statistically significant association rules is an important but neglected problem. Traditional association rules do not capture the idea of statistical dependence, so the resulting rules can be spurious, while the most significant rules may be missing. This leads to erroneous models and predictions, which often become expensive. The problem is computationally very difficult, because significance is not a monotonic property. However, in this paper, we prove several other properties, which can be used for pruning the search space. The properties are implemented in the StatApriori algorithm, which searches for statistically significant, non-redundant association rules. Empirical experiments have shown that StatApriori is very efficient and, at the same time, finds good-quality rules.


Knowledge and Information Systems | 2012

Kingfisher: an efficient algorithm for searching for both positive and negative dependency rules with statistical significance measures

Wilhelmiina Hämäläinen

Statistical dependency analysis is the basis of all empirical science. A commonly occurring problem is to find the most significant dependency rules, which describe either positive or negative dependencies between categorical attributes. In medical science, for example, one is interested in genetic factors that can either predispose to or prevent diseases. The requirement of statistical significance is essential, because the discoveries should also hold in future data. Typically, the significance is estimated either by Fisher's exact test or the χ²-measure. The problem is computationally very difficult, because the number of all possible dependency rules increases exponentially with the number of attributes. As a solution, different kinds of restrictions and heuristics have been applied, but a general, scalable search method has been missing. In this paper, we introduce an efficient algorithm, called Kingfisher, for searching for the best non-redundant dependency rules with statistical significance measures. The rules can express either positive or negative dependencies between a set of positive attributes and a single consequent attribute. The algorithm itself is independent of the goodness measure used, but we concentrate on Fisher's exact test and the χ²-measure. The algorithm is based on an application of the branch-and-bound search strategy, supplemented by several pruning properties. In particular, we prove a new lower bound for Fisher's p and introduce a new effective pruning principle. According to our experiments on classical benchmark data, the algorithm scales well and can efficiently handle even dense and high-dimensional data sets. An interesting observation was that Fisher's exact test not only produced more reliable rules than the χ²-measure, but also made the search much faster.
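
The two significance measures named in the abstract can be computed directly from the counts of a single rule X -> A. The following is a minimal sketch using the textbook definitions (the hypergeometric upper tail for a one-sided Fisher's exact test, and the 2x2 χ² statistic). The function names and the example counts are illustrative assumptions, and the sketch covers only scoring one positive rule, not Kingfisher's search or its pruning bounds.

```python
from math import comb


def fisher_p(n: int, n_x: int, n_a: int, n_xa: int) -> float:
    """One-sided Fisher's exact p-value for positive dependence between X and A.

    With the margins n_x and n_a fixed, this is the probability under
    independence of seeing at least n_xa rows that contain both X and A.
    """
    upper = min(n_x, n_a)
    tail = sum(comb(n_x, k) * comb(n - n_x, n_a - k) for k in range(n_xa, upper + 1))
    return tail / comb(n, n_a)


def chi2(n: int, n_x: int, n_a: int, n_xa: int) -> float:
    """Chi-squared statistic of the 2x2 contingency table for X and A."""
    numerator = n * (n * n_xa - n_x * n_a) ** 2
    denominator = n_x * (n - n_x) * n_a * (n - n_a)
    return numerator / denominator


# Hypothetical counts: 20 rows, X in 10, A in 10, both in 8.
p_value = fisher_p(20, 10, 10, 8)
statistic = chi2(20, 10, 10, 8)
```

A smaller p-value and a larger χ² value both indicate a stronger dependency. For a negative dependency one would sum the lower tail of the same distribution instead.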


Frontiers in Education | 2004

Problem-based learning of theoretical computer science

Wilhelmiina Hämäläinen

In this paper, we report our first experiment in teaching the theory of computability in a problem-based way. As far as we know, this is the first experiment in applying the problem-based method to a purely theoretical computer science course. The course consisted of three parts. First, the new subjects were learnt according to the classical seven-step method, which contains both individual and group work, and problem reports were written. Second, the students participated in a traditional exercise session, in which the new techniques were practised in detail. Third, the students kept a learning diary, in which they processed the subjects further, tried to construct an overall schema of the things learnt, and monitored their own learning. The results were very successful: the students were well committed and the drop-out percentage was very small; they achieved a very deep understanding of the subjects, as measured by their grades and the quality of their learning diaries; the experience was enjoyable for both the students and the teachers; and finally, the method supported different kinds of learners very well.


Intelligent Systems Design and Applications | 2011

Jerk-based feature extraction for robust activity recognition from acceleration data

Wilhelmiina Hämäläinen; Mikko Järvinen; Paula Martiskainen; Jaakko Mononen

A current trend in activity recognition is to use just one easily carried accelerometer, either integrated into a mobile phone, carried in a pocket, or attached to an animal's collar. The main disadvantage of this approach is that the orientation of the accelerometer is generally unknown. Therefore, one cannot separate body-related accelerations from the gravitational acceleration or determine the real directions of the observed accelerations accurately. As a solution, we introduce a new technique where jerk (the change in acceleration) is analyzed instead of the original acceleration signal. The total jerk magnitude is completely orientation-independent and reflects only body-related accelerations. If the direction of gravity can be approximated even loosely, then the jerk signal can be further enriched with valuable information on jerk angles (direction changes). According to our experiments, this kind of jerk-filtered signal produces robust features and can improve the recognition accuracy remarkably.
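
The orientation independence of the jerk magnitude can be checked numerically. The sketch below is an illustration under stated assumptions, not the feature extraction used in the paper: jerk is approximated by the difference of consecutive samples divided by a fixed sampling interval, the readings are made up, and the sensor orientation is assumed to stay fixed over the recording, so the constant gravity component cancels in the difference.

```python
import math


def jerk_magnitudes(samples, dt):
    """Magnitude of the discrete jerk between consecutive 3-axis samples."""
    mags = []
    for (x0, y0, z0), (x1, y1, z1) in zip(samples, samples[1:]):
        jx, jy, jz = (x1 - x0) / dt, (y1 - y0) / dt, (z1 - z0) / dt
        mags.append(math.sqrt(jx * jx + jy * jy + jz * jz))
    return mags


def rotate_x(sample, angle):
    """Rotate one sample about the x-axis, emulating a differently mounted sensor."""
    x, y, z = sample
    c, s = math.cos(angle), math.sin(angle)
    return (x, c * y - s * z, s * y + c * z)


# Hypothetical readings in m/s^2 sampled at 10 Hz; gravity lies on the z-axis here.
readings = [(0.0, 0.0, 9.81), (0.5, 0.1, 9.81), (1.2, -0.3, 10.4), (0.4, 0.0, 9.6)]
dt = 0.1

original = jerk_magnitudes(readings, dt)
# The same motion seen by a sensor tilted by 0.7 rad: gravity now spans y and z.
rotated = jerk_magnitudes([rotate_x(r, 0.7) for r in readings], dt)
```

The two lists agree up to floating-point rounding, because a rotation preserves the length of the difference between two vectors, while the raw per-axis readings differ between the two mountings.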


International Conference on Data Mining | 2010

Efficient Discovery of the Top-K Optimal Dependency Rules with Fisher's Exact Test of Significance

Wilhelmiina Hämäläinen

Statistical dependency analysis is the basis of all empirical science. A commonly occurring problem is to find the most significant dependency rules, which describe either positive or negative dependencies between categorical attributes. For example, in medical science one is interested in genetic factors, which can either predispose or prevent diseases. The requirement of statistical significance is essential, because the discoveries should hold also in the future data. Typically, the significance is estimated either by Fisher's exact test or the χ²-measure. The problem is computationally very difficult, because the number of all possible dependency rules increases exponentially with the number of attributes. As a solution, different kinds of restrictions and heuristics have been applied, but a general, scalable search method has been missing. In this paper, we introduce an efficient algorithm for searching for the top-K globally optimal dependency rules using Fisher's exact test as a measure function. The rules can express either positive or negative dependencies between a set of positive attributes and a single consequent attribute. The algorithm is based on an application of the branch-and-bound search strategy, supplemented by several pruning properties. Especially, we prove a new lower-bound for the Fisher's p, and introduce a new effective pruning principle. The general search algorithm is applicable to other goodness measures, like the χ²-measure.


Fundamenta Informaticae | 2011

Efficient Search Methods for Statistical Dependency Rules

Wilhelmiina Hämäläinen


Computational Statistics & Data Analysis | 2016

New upper bounds for tight and fast approximation of Fisher's exact test in dependency rule mining

Wilhelmiina Hämäläinen


Knowledge Discovery and Data Mining | 2014

Statistically sound pattern discovery

Wilhelmiina Hämäläinen; Geoffrey I. Webb

Collaboration


Top Co-Authors

Jaakko Mononen
University of Eastern Finland

Mikko Järvinen
University of Eastern Finland

Salla Ruuska
University of Eastern Finland

Paula Martiskainen
University of Eastern Finland

Ville Kumpulainen
University of Eastern Finland