Mika Klemettinen | Researchain



Publication

Featured research published by Mika Klemettinen.

conference on information and knowledge management | 1994

Finding interesting rules from large sets of discovered association rules

Mika Klemettinen; Heikki Mannila; Pirjo Ronkainen; Hannu Toivonen; A. Inkeri Verkamo

Association rules, introduced by Agrawal, Imielinski, and Swami, are rules of the form “for 90% of the rows of the relation, if the row has value 1 in the columns in set W, then it has 1 also in column B”. Efficient methods exist for discovering association rules from large collections of data. The number of discovered rules can, however, be so large that browsing the rule set and finding interesting rules from it can be quite difficult for the user. We show how a simple formalism of rule templates makes it possible to easily describe the structure of interesting rules. We also give examples of visualization of rules, and show how a visualization tool interfaces with rule templates.
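The abstract's notions of rule confidence and rule templates can be illustrated with a minimal sketch. It assumes each row is represented as the set of columns holding value 1, and it uses a deliberately simplified template (a set of attributes allowed on the left-hand side and a set allowed on the right-hand side); the function names are hypothetical, and the template language of the paper itself may be richer than this.

```python
def support(rows, items):
    """Fraction of rows that contain every attribute in `items`."""
    items = set(items)
    return sum(1 for r in rows if items <= r) / len(rows)

def confidence(rows, lhs, rhs):
    """Among rows containing all of `lhs`, the fraction that also contain `rhs`."""
    lhs = set(lhs)
    lhs_count = sum(1 for r in rows if lhs <= r)
    if lhs_count == 0:
        return 0.0
    both = sum(1 for r in rows if lhs <= r and rhs in r)
    return both / lhs_count

def matches_template(rule, allowed_lhs, allowed_rhs):
    """Simplified template (an assumption): every LHS attribute must be in
    `allowed_lhs` and the RHS attribute must be in `allowed_rhs`."""
    lhs, rhs = rule
    return set(lhs) <= set(allowed_lhs) and rhs in allowed_rhs

# Toy relation: each row lists the columns whose value is 1.
rows = [{"A", "B"}, {"A", "B", "C"}, {"A"}, {"B", "C"}]
rules = [({"A"}, "B"), ({"B"}, "C"), ({"C"}, "A")]

# Keep only rules that predict B or C from A or B.
interesting = [r for r in rules if matches_template(r, {"A", "B"}, {"B", "C"})]
for lhs, rhs in interesting:
    print(sorted(lhs), "=>", rhs, confidence(rows, lhs, rhs))
```

Filtering by template before ranking by confidence keeps the browsing step cheap: the template check inspects only the rule, not the data.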

international conference on data engineering | 1996

Knowledge discovery from telecommunication network alarm databases

Kimmo Hätönen; Mika Klemettinen; Heikki Mannila; Pirjo Ronkainen; Hannu Toivonen

A telecommunication network produces large amounts of alarm data daily. The data contains valuable hidden knowledge about the behavior of the network. This knowledge can be used in filtering redundant alarms, locating problems in the network, and possibly in predicting severe faults. We describe the TASA (Telecommunication Network Alarm Sequence Analyzer) system for discovering and browsing knowledge from large alarm databases. The system is built on the basis of viewing knowledge discovery as an interactive and iterative process, containing data collection, pattern discovery, rule postprocessing, etc. The system uses a novel framework for locating frequently occurring episodes from sequential data. The TASA system offers a variety of selection and ordering criteria for episodes, and supports iterative retrieval from the discovered knowledge. This means that a large part of the iterative nature of the KDD process can be replaced by iteration in the rule postprocessing stage. The user interface is based on dynamically generated HTML. The system is in experimental use, and the results are encouraging: some of the discovered knowledge is being integrated into the alarm handling software of telecommunication operators.
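One way to picture "frequently occurring episodes" is to slide a fixed-width time window over the alarm sequence and measure in what fraction of windows a given ordered pattern of alarm types appears. The sketch below does that for a serial episode. It is a simplified illustration and not the TASA implementation: integer timestamps, the window placement, and the function names are all assumptions.

```python
def contains_serial(events, episode):
    """True if the alarm types in `episode` occur in that order within
    `events`, a time-sorted list of (time, alarm_type) pairs."""
    pos = 0
    for _, alarm in events:
        if alarm == episode[pos]:
            pos += 1
            if pos == len(episode):
                return True
    return False

def window_frequency(sequence, episode, width):
    """Fraction of sliding windows [s, s + width) that contain the episode.

    Assumes a non-empty sequence with integer timestamps and a non-empty
    episode. Windows start at every integer s such that the window overlaps
    the observed time span.
    """
    seq = sorted(sequence)
    first, last = seq[0][0], seq[-1][0]
    starts = range(first - width + 1, last + 1)
    hits = 0
    for s in starts:
        window = [(t, a) for t, a in seq if s <= t < s + width]
        if contains_serial(window, episode):
            hits += 1
    return hits / len(starts)

# Hypothetical alarm log: (time, alarm type).
alarms = [(1, "A"), (2, "B"), (5, "A"), (9, "B")]
freq = window_frequency(alarms, ("A", "B"), width=3)  # 2 of 11 windows
```

An episode would then be called frequent if this fraction meets a user-chosen threshold; the threshold itself is a tuning parameter and is not specified here.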

Journal of Network and Systems Management | 1999

Rule Discovery in Telecommunication Alarm Data

Mika Klemettinen; Heikki Mannila; Hannu Toivonen

Fault management is an important but difficult area of telecommunication network management: networks produce large amounts of alarm information which must be analyzed and interpreted before faults can be located. So-called alarm correlation is a central technique in fault identification. While the use of alarm correlation systems is quite popular and methods for expressing the correlations are maturing, acquiring all the knowledge necessary for constructing an alarm correlation system for a network and its elements is difficult. We describe a novel partial solution to the task of knowledge acquisition for correlation systems. We present a method and a tool for the discovery of recurrent patterns of alarms in databases; these patterns, episode rules, can be used in the construction of real-time alarm correlation systems. We also present tools with which network management experts can browse the large amounts of rules produced. The construction of correlation systems becomes easier with these tools, as the episode rules provide a wealth of statistical information about recurrent phenomena in the alarm stream. This methodology has been implemented in a research system called TASA, which is used by several telecommunication operators. We briefly discuss experiences in the use of TASA.

data warehousing and knowledge discovery | 2002

Mining Association Rules from XML Data

Daniele Braga; Alessandro Campi; Mika Klemettinen; Pier Luca Lanzi

The eXtensible Markup Language (XML) rapidly emerged as a standard for representing and exchanging information. The fast-growing amount of available XML data sets a pressing need for languages and tools to manage collections of XML documents, as well as to mine interesting information out of them. Although the data mining community has not yet rushed into the use of XML, there have been some proposals to exploit XML. However, in practice these proposals mainly rely on more or less traditional relational databases with an XML interface. In this paper, we introduce association rules from native XML documents and discuss the new challenges and opportunities that this topic sets to the data mining community. More specifically, we introduce an extension of XQuery for mining association rules. This extension is used throughout the paper to better define association rule mining within XML and to emphasize its implications in the XML context.

data warehousing and knowledge discovery | 1999

Modeling KDD Processes within the Inductive Database Framework

Jean-François Boulicaut; Mika Klemettinen; Heikki Mannila

One of the most challenging problems in data manipulation in the future is to be able to efficiently handle very large databases but also multiple induced properties or generalizations in that data. Popular examples of useful properties are association rules, and inclusion and functional dependencies. Our view of a possible approach for this task is to specify and query inductive databases, which are databases that in addition to data also contain intensionally defined generalizations about the data. We formalize this concept and show how it can be used throughout the whole process of data mining due to the closure property of the framework. We show that simple query languages can be defined using normal database terminology. We demonstrate the use of this framework to model typical data mining processes. It is then possible to perform various tasks on these descriptions like, e.g., optimizing the selection of interesting properties or comparing two processes.

acm symposium on applied computing | 2003

Discovering interesting information in XML data with association rules

Daniele Braga; Alessandro Campi; Stefano Ceri; Mika Klemettinen; Pier Luca Lanzi

Data mining algorithms are designed to extract interesting information from large amounts of data. They usually assume that source data are in relational (tabular) form. However, the recent success of XML as a standard to represent semi-structured data and the increasing amount of data available in XML pose new challenges to the data mining community. In this paper we introduce association rules for XML data. To accomplish this, we propose a new operator, based on XPath and inspired by the syntax of XQuery, which allows us to express complex mining tasks, compactly and intuitively. The operator can indifferently (and simultaneously) target both the content and the structure of the data, since the distinction in XML is slight.
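The idea of selecting transactions and items from an XML document by path expressions can be sketched with Python's standard `xml.etree.ElementTree`, which supports a limited XPath subset. This does not reproduce the operator proposed in the paper or its syntax; the sample document, the `transactions` and `rule_stats` helpers, and the choice of `article` and `keyword` elements are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
DOC = """<library>
  <article><keyword>XML</keyword><keyword>mining</keyword></article>
  <article><keyword>XML</keyword><keyword>mining</keyword><keyword>SQL</keyword></article>
  <article><keyword>XML</keyword></article>
</library>"""

def transactions(xml_text, context_path, item_path):
    """One transaction per element matched by `context_path` (relative to
    the root); its items are the texts of descendants matched by `item_path`."""
    root = ET.fromstring(xml_text)
    return [
        {item.text for item in ctx.findall(item_path)}
        for ctx in root.findall(context_path)
    ]

def rule_stats(txns, body, head):
    """Return (support, confidence) of the rule body => head."""
    n_body = sum(1 for t in txns if body <= t)
    n_both = sum(1 for t in txns if (body | head) <= t)
    support = n_both / len(txns) if txns else 0.0
    confidence = n_both / n_body if n_body else 0.0
    return support, confidence

txns = transactions(DOC, "article", "keyword")
support, confidence = rule_stats(txns, {"XML"}, {"mining"})  # both 2/3 here
```

To mine structure rather than content, the item set could be built from `child.tag` values instead of `item.text`, which reflects the abstract's remark that the two are close in XML.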

network operations and management symposium | 1996

TASA: Telecommunication Alarm Sequence Analyzer or how to enjoy faults in your network

Kimmo Hätönen; Mika Klemettinen; Heikki Mannila; Pirjo Ronkainen; Hannu Toivonen

Today's large and complex telecommunication networks produce large amounts of alarms daily. The sequence of alarms contains valuable knowledge about the behavior of the network, but much of the knowledge is fragmented and hidden in the vast amount of data. Regularities in the alarms can be used in fault management applications, e.g., for filtering redundant alarms, locating problems in the network, and possibly in predicting severe faults. In this paper we describe TASA (Telecommunication Alarm Sequence Analyzer), a novel system for discovering interesting regularities in the alarms. In the core of the system are algorithms for locating frequent alarm episodes from the alarm stream and presenting them as rules. Discovered rules can then be explored with flexible information retrieval tools that support iteration. The user interface is hypertext, based on HTML, and can be used with a standard WWW browser. TASA is in experimental use and has already discovered rules that have been integrated into the alarm handling software of an operator.

european conference on principles of data mining and knowledge discovery | 1998

Querying Inductive Databases: A Case Study on the MINE RULE Operator

Jean-François Boulicaut; Mika Klemettinen; Heikki Mannila

Knowledge discovery in databases (KDD) is a process that can include steps like forming the data set, data transformations, discovery of patterns, searching for exceptions to a pattern, zooming on a subset of the data, and postprocessing some patterns. We describe a comprehensive framework in which all these steps can be carried out by means of queries over an inductive database. An inductive database is a database that in addition to data also contains intensionally defined generalizations about the data. We formalize this concept: an inductive database consists of a normal database together with a subset of patterns from a class of patterns, and an evaluation function that tells how the patterns occur in the data. Then, looking for potential query languages built on top of SQL, we consider the research on the MINE RULE operator by Meo, Psaila and Ceri. It is a serious step towards an implementation framework for inductive databases, though it addresses only the association rule mining problem. Perspectives are then discussed.
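The definition in the abstract (a normal database, a set of patterns, and an evaluation function relating patterns to data) can be made concrete with a small sketch. The `InductiveDB` class, its method names, and the use of association rules with a (frequency, confidence) evaluation are assumptions chosen for illustration; this is not the formalism or query language of the paper.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Tuple

Row = FrozenSet[str]
Rule = Tuple[FrozenSet[str], str]  # (left-hand side, right-hand attribute)

def freq_conf(rows: List[Row], rule: Rule) -> Tuple[float, float]:
    """Evaluation function: (frequency, confidence) of a rule in the data."""
    lhs, rhs = rule
    n_lhs = sum(1 for r in rows if lhs <= r)
    n_both = sum(1 for r in rows if lhs <= r and rhs in r)
    freq = n_both / len(rows) if rows else 0.0
    conf = n_both / n_lhs if n_lhs else 0.0
    return freq, conf

@dataclass
class InductiveDB:
    rows: List[Row]
    patterns: List[Rule]
    evaluate: Callable[[List[Row], Rule], Tuple[float, float]]

    def select_rows(self, pred):
        """Query the data part; patterns are re-evaluated lazily on the result."""
        kept = [r for r in self.rows if pred(r)]
        return InductiveDB(kept, self.patterns, self.evaluate)

    def select_patterns(self, pred):
        """Query the pattern part using each pattern's current evaluation."""
        kept = [p for p in self.patterns
                if pred(p, *self.evaluate(self.rows, p))]
        return InductiveDB(self.rows, kept, self.evaluate)

db = InductiveDB(
    rows=[frozenset("AB"), frozenset("ABC"), frozenset("A"), frozenset("BC")],
    patterns=[(frozenset("A"), "B"), (frozenset("C"), "A")],
    evaluate=freq_conf,
)
# Keep confident rules, then zoom in on the rows that contain C.
zoomed = db.select_patterns(lambda p, f, c: c > 0.6).select_rows(lambda r: "C" in r)
```

In this sketch every query returns another `InductiveDB`, so data selection and pattern selection can be chained in any order.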

international conference on tools with artificial intelligence | 2002

A tool for extracting XML association rules

Daniele Braga; Alessandro Campi; Stefano Ceri; Mika Klemettinen; Pier Luca Lanzi

The recent success of XML as a standard to represent semi-structured data, and the increasing amount of available XML data pose new challenges to the data mining community. In this paper we present the XMINE operator, a tool developed to extract XML association rules for XML documents. The operator, based on XPath and inspired by the syntax of XQuery, allows us to express complex mining tasks, compactly and intuitively. XMINE can be used to specify indifferently (and simultaneously) mining tasks both on the content and on the structure of the data, since the distinction in XML is slight.

european conference on principles of data mining and knowledge discovery | 1997

Mining in the Phrasal Frontier

Helena Ahonen; Oskari Heinonen; Mika Klemettinen; A. Inkeri Verkamo

Data mining methods have been applied to a wide variety of domains. Surprisingly enough, only a few examples of data mining in text are available. However, considering the amount of existing document collections, text mining would be most useful. Traditionally, texts have been analysed using various information retrieval related methods and natural language processing. In this paper, we present our first experiments in applying general methods of data mining to discovering phrases and co-occurring terms. We also describe the text mining process developed. Our results show that data mining methods — with appropriate preprocessing — can be used in text processing, and that by shifting the focus the process can be used to obtain results for various purposes.
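As a rough illustration of finding co-occurring terms after preprocessing, the sketch below lowercases each document, drops a few stopwords, and counts how many documents contain each pair of terms. The stopword list, tokenization, and function names are assumptions made for this example and are not taken from the paper.

```python
import re
from collections import Counter
from itertools import combinations

# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "a", "of", "in", "and", "to", "is"}

def terms(text):
    """Preprocessing: lowercase, keep alphabetic tokens, drop stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}

def frequent_pairs(docs, min_count):
    """Term pairs that co-occur in at least `min_count` documents."""
    counts = Counter()
    for doc in docs:
        for pair in combinations(sorted(terms(doc)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_count}

docs = ["Data mining in text", "Text mining is useful", "Mining of data"]
pairs = frequent_pairs(docs, min_count=2)
# {('data', 'mining'): 2, ('mining', 'text'): 2}
```

Treating each document as a set of terms makes it play the role of a row in a 0/1 relation, so the same frequency thresholds used for association rules apply.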
