Publication


Featured research published by Heikki Mannila.


Data Mining and Knowledge Discovery | 1997

Discovery of Frequent Episodes in Event Sequences

Heikki Mannila; Hannu Toivonen; A. Inkeri Verkamo

Sequences of events describing the behavior and actions of users or systems can be collected in several domains. An episode is a collection of events that occur relatively close to each other in a given partial order. We consider the problem of discovering frequently occurring episodes in a sequence. Once such episodes are known, one can produce rules for describing or predicting the behavior of the sequence. We give efficient algorithms for the discovery of all frequent episodes from a given class of episodes, and present detailed experimental results. The methods are in use in telecommunication alarm management.
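The notion of a frequently occurring episode can be illustrated with a small sketch. It is not the paper's algorithm: it only counts, by brute force, the fraction of sliding time windows in which a serial episode (event types in a fixed order) occurs. The names `occurs_serially` and `window_frequency`, the integer timestamps, and the exact window range are assumptions made for this illustration.

```python
from typing import List, Sequence, Tuple

Event = Tuple[int, str]  # (integer timestamp, event type); assumed representation


def occurs_serially(window: Sequence[Event], episode: Sequence[str]) -> bool:
    """Return True if the episode's event types appear in the window in order."""
    pos = 0
    for _, kind in window:
        if pos < len(episode) and kind == episode[pos]:
            pos += 1
    return pos == len(episode)


def window_frequency(events: List[Event], episode: Sequence[str], width: int) -> float:
    """Fraction of sliding windows of the given width that contain the episode.

    Assumption: windows are half-open intervals [start, start + width) and
    'start' ranges over every integer for which the window overlaps the sequence.
    """
    if not events or width <= 0:
        return 0.0
    ordered = sorted(events, key=lambda e: e[0])  # stable sort by timestamp
    t_first, t_last = ordered[0][0], ordered[-1][0]
    starts = range(t_first - width + 1, t_last + 1)
    hits = 0
    for start in starts:
        window = [e for e in ordered if start <= e[0] < start + width]
        if occurs_serially(window, episode):
            hits += 1
    return hits / len(starts)
```

An episode would then be called frequent when this fraction reaches a user-chosen threshold. The brute-force scan rebuilds every window from scratch, so it is only suitable for small illustrative inputs.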


Data Mining and Knowledge Discovery | 1997

Levelwise Search and Borders of Theories in Knowledge Discovery

Heikki Mannila; Hannu Toivonen

One of the basic problems in knowledge discovery in databases (KDD) is the following: given a data set r, a class L of sentences for defining subgroups of r, and a selection predicate, find all sentences of L deemed interesting by the selection predicate. We analyze the simple levelwise algorithm for finding all such descriptions. We give bounds for the number of database accesses that the algorithm makes. For this, we introduce the concept of the border of a theory, a notion that turns out to be surprisingly powerful in analyzing the algorithm. We also consider the verification problem of a KDD process: given r and a set of sentences S ⊆ L, determine whether S is exactly the set of interesting statements about r. We show strong connections between the verification problem and the hypergraph transversal problem. The verification problem arises in a natural way when using sampling to speed up the pattern discovery step in KDD.


knowledge discovery and data mining | 2001

Random projection in dimensionality reduction: applications to image and text data

Ella Bingham; Heikki Mannila

Random projections have recently emerged as a powerful method for dimensionality reduction. Theoretical results indicate that the method preserves distances quite nicely; however, empirical results are sparse. We present experimental results on using random projection as a dimensionality reduction tool in a number of cases, where the high dimensionality of the data would otherwise lead to burdensome computations. Our application areas are the processing of both noisy and noiseless images, and information retrieval in text documents. We show that projecting the data onto a random lower-dimensional subspace yields results comparable to conventional dimensionality reduction methods such as principal component analysis: the similarity of data vectors is preserved well under random projection. However, using random projections is computationally significantly less expensive than using, e.g., principal component analysis. We also show experimentally that using a sparse random matrix gives additional computational savings in random projection.


conference on information and knowledge management | 1994

Finding interesting rules from large sets of discovered association rules

Mika Klemettinen; Heikki Mannila; Pirjo Ronkainen; Hannu Toivonen; A. Inkeri Verkamo

Association rules, introduced by Agrawal, Imielinski, and Swami, are rules of the form “for 90% of the rows of the relation, if the row has value 1 in the columns in set W, then it has 1 also in column B”. Efficient methods exist for discovering association rules from large collections of data. The number of discovered rules can, however, be so large that browsing the rule set and finding interesting rules from it can be quite difficult for the user. We show how a simple formalism of rule templates makes it possible to easily describe the structure of interesting rules. We also give examples of visualization of rules, and show how a visualization tool interfaces with rule templates.


ACM Transactions on Knowledge Discovery from Data (TKDD) | 2007

Clustering aggregation

Aristides Gionis; Heikki Mannila; Panayiotis Tsaparas

We consider the following problem: given a set of clusterings, find a clustering that agrees as much as possible with the given clusterings. This problem, clustering aggregation, appears naturally in various contexts. For example, clustering categorical data is an instance of the problem: each categorical variable can be viewed as a clustering of the input rows. Moreover, clustering aggregation can be used as a meta-clustering method to improve the robustness of clusterings. The problem formulation does not require a-priori information about the number of clusters, and it gives a natural way for handling missing values. We give a formal statement of the clustering-aggregation problem, we discuss related work, and we suggest a number of algorithms. For several of the methods we provide theoretical guarantees on the quality of the solutions. We also show how sampling can be used to scale the algorithms for large data sets. We give an extensive empirical evaluation demonstrating the usefulness of the problem and of the solutions.


Communications of The ACM | 1996

A database perspective on knowledge discovery

Tomasz Imielinski; Heikki Mannila

Database mining is not simply another buzzword for statistical data analysis or inductive learning. Database mining sets new challenges to database technology: new concepts and methods are needed for query languages, basic operations, and query processing strategies. The most important new component is the ad hoc nature of knowledge and data discovery (KDD) queries and the need for efficient query compilation into a multitude of existing and new data analysis methods. Hence, database mining builds upon the existing body of work in statistics and machine learning but provides completely new functionalities. The current generation of database systems are designed mainly to support business applications. The success of Structured Query Language (SQL) has capitalized on a small number of primitives sufficient to support a vast majority of such applications. Unfortunately, these primitives are not sufficient to capture the emerging family of new applications dealing with knowledge discovery. Most current KDD systems offer isolated discovery features using tree inducers, neural nets, and rule discovery algorithms. Such systems cannot be embedded into a large application and typically offer just one knowledge dis-

The concept of data mining as a querying process and the first steps toward efficient development of knowledge discovery applications are discussed.


ACM Transactions on Database Systems | 1997

Disjunctive datalog

Thomas Eiter; Georg Gottlob; Heikki Mannila


American Journal of Human Genetics | 1999

A Genomewide Screen for Schizophrenia Genes in an Isolated Finnish Subpopulation, Suggesting Multiple Susceptibility Loci

Iiris Hovatta; Teppo Varilo; Jaana Suvisaari; Joseph D. Terwilliger; Vesa Ollikainen; Ritva Arajärvi; Hannu Juvonen; Marja-Liisa Kokko-Sahin; Leena Väisänen; Heikki Mannila; Jouko Lönnqvist; Leena Peltonen


european conference on principles of data mining and knowledge discovery | 1997

Finding Similar Time Series

Gautam Das; Dimitrios Gunopulos; Heikki Mannila


international conference on data mining | 2001

Time series segmentation for context recognition in mobile devices

Johan Himberg; Kalle Korpiaho; Heikki Mannila; Johanna Tikanmäki; Hannu Toivonen

Collaboration


Dive into Heikki Mannila's collaborations.

Top Co-Authors

Padhraic Smyth

University of California

Jouni K. Seppänen

Helsinki University of Technology

Ella Bingham

Helsinki University of Technology

Mikko Koivisto

Helsinki Institute for Information Technology
