Max Bramer | Researchain

Publication

Featured research published by Max Bramer.

intelligent data analysis | 1997

Techniques for Dealing with Missing Values in Classification

Wei Zhong Liu; Allan P. White; Simon G. Thompson; Max Bramer

A brief overview of the history of the development of decision tree induction algorithms is followed by a review of techniques for dealing with missing attribute values in the operation of these methods. The technique of dynamic path generation is described in the context of tree-based classification methods. The waste of data which can result from casewise deletion of missing values in statistical algorithms is discussed and alternatives proposed.
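
To illustrate the waste that casewise deletion can cause, the following minimal sketch compares the fraction of records discarded with the fraction of attribute values that are actually missing. The function name `casewise_deletion_loss` and the use of `None` to mark a missing value are assumptions made for this example; they are not taken from the paper.

```python
def casewise_deletion_loss(rows):
    """Return (fraction of rows discarded, fraction of cells missing).

    `rows` is a list of equal-length tuples; None marks a missing value
    (an assumption of this sketch).
    """
    total_cells = sum(len(r) for r in rows)
    missing_cells = sum(v is None for r in rows for v in r)
    # Casewise deletion drops a whole row if any single value is missing.
    dropped_rows = sum(any(v is None for v in r) for r in rows)
    return dropped_rows / len(rows), missing_cells / total_cells


example = [(1, 2, None), (1, None, 3), (None, 2, 3), (1, 2, 3)]
rows_lost, cells_missing = casewise_deletion_loss(example)
```

In this toy dataset a quarter of the values are missing, yet casewise deletion would discard three of the four records.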

Archive | 2000

Automatic Induction of Classification Rules from Examples Using N-Prism

Max Bramer

One of the key technologies of data mining is the automatic induction of rules from examples, particularly the induction of classification rules. Most work in this field has concentrated on the generation of such rules in the intermediate form of decision trees. An alternative approach is to generate modular classification rules directly from the examples. This paper seeks to establish a revised form of the rule generation algorithm Prism as a credible candidate for use in the automatic induction of classification rules from examples in practical domains where noise may be present and where predicting the classification for previously unseen instances is the primary focus of attention.
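
Generating modular rules directly from examples can be sketched roughly as below. This is a simplified illustration in the spirit of the basic Prism idea, not the revised N-Prism algorithm the paper describes: it has no special handling for noise, and the names `induce_rules`, `rows` and `target` are hypothetical. For each class it repeatedly adds the attribute-value test that gives the highest proportion of that class among the instances still covered, then removes the instances covered by the finished rule.

```python
def induce_rules(rows, target):
    """Induce rules as (conditions, class) pairs from a list of dicts."""
    rules = []
    for cls in sorted({r[target] for r in rows}):
        remaining = list(rows)  # each class starts from the full training set
        while any(r[target] == cls for r in remaining):
            covered, conds = remaining, {}
            # Specialise the rule until it covers only `cls`
            # or there are no unused attributes left.
            while any(r[target] != cls for r in covered):
                candidates = {(a, r[a]) for r in covered for a in r
                              if a != target and a not in conds}
                if not candidates:
                    break

                def score(av):
                    a, v = av
                    subset = [r for r in covered if r[a] == v]
                    hits = sum(r[target] == cls for r in subset)
                    # Prefer higher precision, then larger coverage.
                    return (hits / len(subset), len(subset))

                a, v = max(sorted(candidates), key=score)
                conds[a] = v
                covered = [r for r in covered if r[a] == v]
            rules.append((dict(conds), cls))
            # Remove the instances the new rule covers before continuing.
            remaining = [r for r in remaining if r not in covered]
    return rules
```

Each returned rule is independent of the others, which is what distinguishes this style of output from rules read off the branches of a decision tree.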

Knowledge Based Systems | 2002

Using J-Pruning to Reduce Overfitting in Classification Trees

Max Bramer

The automatic induction of classification rules from examples in the form of a decision tree is an important technique used in data mining. One of the problems encountered is the overfitting of rules to training data. In some cases this can lead to an excessively large number of rules, many of which have very little predictive value for unseen data. This paper is concerned with the reduction of overfitting during decision tree generation. It introduces a technique known as J-pruning, based on the J-measure, an information theoretic means of quantifying the information content of a rule.
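
For reference, the J-measure of a rule "IF Y = y THEN X = x" can be computed as in the sketch below, which follows the commonly cited definition due to Smyth and Goodman. The abstract does not give the formula, so treat this as background rather than the paper's exact formulation; the function and parameter names are chosen for this example.

```python
from math import log2


def j_measure(p_y, p_x, p_x_given_y):
    """J-measure (in bits) of the rule IF Y=y THEN X=x.

    p_y          probability that the rule's left-hand side holds
    p_x          prior probability of the right-hand side
    p_x_given_y  probability of the right-hand side given the left-hand side
    """
    def term(p, q):
        # p * log2(p / q), using the convention 0 * log(0) = 0
        return 0.0 if p == 0.0 else p * log2(p / q)

    # Cross-entropy between the posterior and prior distributions of X.
    j = term(p_x_given_y, p_x) + term(1.0 - p_x_given_y, 1.0 - p_x)
    # Weight by how often the rule fires.
    return p_y * j
```

A rule whose left-hand side tells us nothing new (`p_x_given_y == p_x`) scores zero, while a rule that fires often and sharply changes the class probability scores highly.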

international conference on intelligent information processing | 2002

An Information-Theoretic Approach to the Pre-pruning of Classification Rules

Max Bramer

The automatic induction of classification rules from examples is an important technique used in data mining. One of the problems encountered is the overfitting of rules to training data. In some cases this can lead to an excessively large number of rules, many of which have very little predictive value for unseen data. This paper is concerned with the reduction of overfitting. It introduces a technique known as J-pruning, based on the J-measure, an information theoretic means of quantifying the information content of a rule and applies this to two rule induction methods: one where the rules are generated via the intermediate representation of a decision tree and one where rules are generated directly from examples.

international conference on tools with artificial intelligence | 2010

Pocket Data Mining: Towards Collaborative Data Mining in Mobile Computing Environments

Frederic T. Stahl; Mohamed Medhat Gaber; Max Bramer; Philip S. Yu

Pocket Data Mining (PDM) is our new term describing collaborative mining of streaming data in mobile and distributed computing environments. With large numbers of data streams now available for subscription on our smart mobile phones, using this data for decision making with data stream mining techniques has become achievable owing to the increasing power of these handheld devices. Wireless communication among these devices using Bluetooth and WiFi technologies has opened the door wide for collaborative mining among mobile devices within the same range that are running data mining techniques targeting the same application. This paper proposes a new architecture that we have prototyped for realizing the significant applications in this area. We have proposed using mobile software agents in this application for several reasons. Most importantly, the autonomic intelligent behaviour of agent technology has been the driving force for using it in this application. Other efficiency reasons are discussed in detail in this paper. Experimental results showing the feasibility of the proposed architecture are presented and discussed.

Knowledge Based Systems | 2012

Computationally efficient induction of classification rules with the PMCRI and J-PMCRI frameworks

Frederic T. Stahl; Max Bramer

In order to gain knowledge from large databases, scalable data mining technologies are needed. Data are captured on a large scale and thus databases are increasing at a fast pace. This leads to the utilisation of parallel computing technologies in order to cope with large amounts of data. In the area of classification rule induction, parallelisation of classification rules has focused on the divide and conquer approach, also known as the Top Down Induction of Decision Trees (TDIDT). An alternative approach to classification rule induction is separate and conquer which has only recently been in the focus of parallelisation. This work introduces and evaluates empirically a framework for the parallel induction of classification rules, generated by members of the Prism family of algorithms. All members of the Prism family of algorithms follow the separate and conquer approach.

International Journal of Systems Science | 2005

Inducer: a public domain workbench for data mining

Max Bramer

This paper describes the facilities available in Inducer, a public domain classification workbench aimed at users who wish to analyse their own datasets using a range of data mining strategies or to conduct experiments with a given technique or combination of techniques across a range of datasets. Inducer has a graphical user interface which is designed to be easy-to-use by beginners, but also includes a range of advanced features for experienced users, including facilities to export the information generated in a form suitable for further processing by other packages. Experiments using the workbench are described.

Knowledge Based Systems | 2012

Jmax-pruning: A facility for the information theoretic pruning of modular classification rules

Frederic T. Stahl; Max Bramer

The Prism family of algorithms induces modular classification rules in contrast to the Top Down Induction of Decision Trees (TDIDT) approach which induces classification rules in the intermediate form of a tree structure. Both approaches achieve a comparable classification accuracy. However in some cases Prism outperforms TDIDT. For both approaches pre-pruning facilities have been developed in order to prevent the induced classifiers from overfitting on noisy datasets, by cutting rule terms or whole rules or by truncating decision trees according to certain metrics. There have been many pre-pruning mechanisms developed for the TDIDT approach, but for the Prism family the only existing pre-pruning facility is J-pruning. J-pruning not only works on Prism algorithms but also on TDIDT. Although it has been shown that J-pruning produces good results, this work points out that J-pruning does not use its full potential. The original J-pruning facility is examined and the use of a new pre-pruning facility, called Jmax-pruning, is proposed and evaluated empirically. A possible pre-pruning facility for TDIDT based on Jmax-pruning is also discussed.

Archive | 1999

Applications and Innovations in Expert Systems VI

Robert W. Milne; Ann Macintosh; Max Bramer

Knowledge management is currently attracting a great deal of interest in the business community. Organisations are increasingly coming to regard their knowledge as a key asset and resource. How it is supported and developed are seen as issues of strategic importance. Meeting these needs requires interventions in four key dimensions: people, process, technology and content. Expert, or knowledge-based systems, as one of few disciplines deeply engaged with the issues of knowledge, thinking and decision-making, can potentially play an important role in knowledge management. In this paper the author draws on practical experience and original research to explain the knowledge management phenomenon and illustrate the practical role that expert systems can play, and are already playing in this emerging field.

1 Knowledge Management

Whether making a decision, assessing a proposition, forecasting a trend, designing a new facility, diagnosing a problem, understanding customer needs or making a plan, organisations rely on knowledge in all the intelligent, judgmental tasks they perform. Customer knowledge, product knowledge, competitor knowledge, process knowledge: nobody doubts the value and importance of knowledge in business. Indeed, for many organisations today these intangible competencies are thought to be a more important economic factor than capital, labour or other resources. Until recently, however, this vital resource has usually suffered disproportionately low attention because it has been poorly understood by business. Problems such as bad decision-making, repeated mistakes and the failure to innovate and share experience have been recognised, but ready solutions have not seemed available. Now, however, the emerging field of Knowledge Management is providing new insight into how to support the processes whereby knowledge is created, disseminated, applied, captured, stored, exploited and valued.

machine learning and data mining in pattern recognition | 2009

PMCRI: A Parallel Modular Classification Rule Induction Framework

Frederic T. Stahl; Max Bramer; Mo Adda

In a world where massive amounts of data are recorded on a large scale we need data mining technologies to gain knowledge from the data in a reasonable time. The Top Down Induction of Decision Trees (TDIDT) algorithm is a very widely used technology to predict the classification of newly recorded data. However alternative technologies have been derived that often produce better rules but do not scale well on large datasets. Such an alternative to TDIDT is the PrismTCS algorithm. PrismTCS performs particularly well on noisy data but does not scale well on large datasets. In this paper we introduce Prism and investigate its scaling behaviour. We describe how we improved the scalability of the serial version of Prism and investigate its limitations. We then describe our work to overcome these limitations by developing a framework to parallelise algorithms of the Prism family and similar algorithms. We also present the scale up results of a first prototype implementation.
