Ivan Bruha
McMaster University
Publication
Featured research published by Ivan Bruha.
SIGKDD Explorations | 2000
Ivan Bruha; A. Famili
This article surveys the contents of the workshop Post-Processing in Machine Learning and Data Mining: Interpretation, Visualization, Integration, and Related Topics, held within KDD-2000: The Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Boston, MA, USA, 20-23 August 2000. The corresponding web site is www.acm.org/sigkdd/kdd2000. First, this survey paper introduces the state of the art of the workshop topics, emphasizing that postprocessing forms a significant component of Knowledge Discovery in Databases (KDD). Next, the article reports on the contents, analysis, discussion, and other aspects of the workshop. Afterwards, we survey all the workshop papers, which can be found at (and downloaded from) www.cas.mcmaster.ca/~bruha/kdd2000/kddrep.html. The authors of this report were the organizers of the workshop; the programme committee was formed by three additional researchers in this field.
European Conference on Principles of Data Mining and Knowledge Discovery | 1998
Petr Berka; Ivan Bruha
Unlike on-line discretization performed by a number of machine learning (ML) algorithms for building decision trees or decision rules, we propose off-line algorithms for discretizing numerical attributes and grouping values of nominal attributes. The number of resulting intervals obtained by discretization depends only on the data; the number of groups corresponds to the number of classes. Since both discretization and grouping are done with respect to the goal classes, the algorithms are suitable only for classification/prediction tasks.
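To make the idea of off-line, class-driven discretization concrete, here is a minimal Python sketch. It is not the authors' algorithm: it assumes a simplified rule in which each distinct value is labelled with its majority class and neighbouring values with the same majority class are merged into one interval. The function names `class_driven_cut_points` and `to_interval` are hypothetical. As in the abstract, the number of intervals is not a parameter; it follows from the data.

```python
from bisect import bisect_right
from collections import Counter, defaultdict


def class_driven_cut_points(values, labels):
    """Return cut points separating runs of values that share a majority class.

    Simplifying assumption of this sketch: a boundary is placed midway
    between two neighbouring distinct values whose majority classes differ.
    """
    counts = defaultdict(Counter)
    for value, label in zip(values, labels):
        counts[value][label] += 1

    distinct = sorted(counts)
    # Majority class per distinct value; ties are broken by class name
    # so that the result is deterministic.
    majority = [
        min(counts[v].items(), key=lambda kv: (-kv[1], str(kv[0])))[0]
        for v in distinct
    ]

    cuts = []
    for i in range(1, len(distinct)):
        if majority[i] != majority[i - 1]:
            cuts.append((distinct[i - 1] + distinct[i]) / 2)
    return cuts


def to_interval(value, cuts):
    """Map a numerical value to the index of the interval it falls into."""
    return bisect_right(cuts, value)
```

Because the cut points are computed once, before learning, the discretized attribute can be passed to any symbolic learner afterwards, which is what distinguishes the off-line setting from discretization done inside a tree or rule inducer.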
International Journal of Pattern Recognition and Artificial Intelligence | 1998
Petr Berka; Ivan Bruha
The genuine symbolic machine learning (ML) algorithms are capable of processing symbolic, categorial data only. However, real-world problems, e.g. in medicine or finance, involve both symbolic and ...
Intelligent Information Systems | 2004
Ivan Bruha
Efficient, robust data mining algorithms should include routines for processing unknown (missing) attribute values when acquiring knowledge from real-world databases, because such data usually contain a certain percentage of missing values. Bruha and Franek (1996) found that each dataset has, more or less, its own ‘favourite’ routine for processing unknown attribute values; the choice evidently depends on the magnitude of noise and the source of unknownness in each dataset. This paper presents one way of choosing an efficient routine for processing unknown attribute values for a given database. The covering machine learning algorithm CN4, a large extension of the well-known CN2 algorithm, is used here as an inductive vehicle. Each of the six routines for unknown attribute value processing available in CN4 is used independently to process a given database. Afterwards, a meta-learner is used to derive a meta-classifier that makes the overall (final) decision about the class of unseen input objects. The entire system is called a meta-combiner. The meta-database formed for the meta-learner can be inconsistent, which may decrease the performance of the entire meta-classifier. Therefore, the existing meta-system (Meta-CN4) has been enhanced by a ‘purification’ procedure that resolves the conflicts among inconsistent meta-data. The paper first surveys the CN4 algorithm, including its six routines for unknown attribute value processing. Afterwards, it introduces the methodology of the meta-learner, including the enhancement that handles inconsistent meta-databases. Finally, the results of experiments with various percentages of unknown attribute values on real-world data are presented, and the performances of the meta-classifier and the six base classifiers are compared.
The paper also explains the difference between the meta-combiner (meta-learner) described here and the cross-validation procedure used for obtaining the classification accuracy.
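The meta-combiner structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Meta-CN4 itself: base classifiers are plain callables, the meta-classifier is a lookup table rather than a learned rule list, and "purification" is assumed to mean keeping the most frequent class among meta-examples with identical prediction vectors. All function names are hypothetical.

```python
from collections import Counter, defaultdict


def build_meta_database(base_classifiers, examples):
    """Create one meta-example per training example.

    Each meta-example pairs the tuple of base predictions with the true class.
    """
    return [
        (tuple(clf(x) for clf in base_classifiers), y)
        for x, y in examples
    ]


def purify(meta_db):
    """Resolve inconsistent meta-data (assumed rule of this sketch).

    Identical prediction vectors with different classes are collapsed
    to the class seen most often for that vector.
    """
    by_vector = defaultdict(Counter)
    for vector, y in meta_db:
        by_vector[vector][y] += 1
    return {
        vector: counts.most_common(1)[0][0]
        for vector, counts in by_vector.items()
    }


def meta_classify(base_classifiers, table, x):
    """Classify x from the base predictions via the purified table.

    Unseen prediction vectors fall back to a majority vote of the
    base classifiers (again an assumption, not the paper's method).
    """
    vector = tuple(clf(x) for clf in base_classifiers)
    if vector in table:
        return table[vector]
    return Counter(vector).most_common(1)[0][0]
```

In the setting of the abstract, each base classifier would correspond to a model induced with one of the six routines for unknown attribute values, so the meta-level learns which combinations of their answers to trust.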
International Journal of Pattern Recognition and Artificial Intelligence | 1996
Ivan Bruha; Frantisek Franek
Simple inductive learning algorithms assume that all attribute values are available. Quinlan's well-known paper¹ discusses quite a few routines for the processing of unknown attribute values in the TDIDT family and analyzes seven of them. This paper introduces five routines for the processing of unknown attribute values that have been designed for the CN4 learning algorithm, a large extension of the well-known CN2. Both CN2 and CN4 induce lists of decision rules from examples by applying the covering paradigm. CN2 offers two ways of processing unknown attribute values. CN4's five routines differ in the style of matching complexes with examples (objects) that involve unknown attribute values. The definition of matching is discussed in detail in the paper. The strategy of unknown value processing is described for both the learning and classification phases of the individual routines. The results of experiments with various percentages of unknown attribute values on real-world (mostly medical) data are presented, and the performances of all five routines are compared.
Systems, Man and Cybernetics | 1988
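To illustrate what "style of matching" can mean, the following Python sketch matches a complex (a conjunction of attribute = value selectors) against an example under two contrasting policies. The policies `"accept"` and `"reject"` are illustrative assumptions only; they are not the names or definitions of the five CN4 routines, which the abstract does not spell out.

```python
UNKNOWN = None  # marker for a missing attribute value (assumption of this sketch)


def matches(complex_, example, policy="accept"):
    """Check whether an example satisfies every selector of a complex.

    complex_: dict mapping attribute name to the required value.
    example:  dict mapping attribute name to a value or UNKNOWN.
    policy:   "accept" lets an unknown value satisfy any selector;
              "reject" makes an unknown value fail the selector.
    """
    if policy not in ("accept", "reject"):
        raise ValueError(f"unsupported policy: {policy!r}")

    for attribute, required in complex_.items():
        value = example.get(attribute, UNKNOWN)
        if value is UNKNOWN:
            if policy == "reject":
                return False
            continue  # "accept": the unknown value matches this selector
        if value != required:
            return False
    return True
```

The two policies give different coverage for the same rule, which is why the choice of routine can change both the rules induced during learning and the decisions made during classification.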
Ivan Bruha; G.P. Madhavan
A pattern recognition system for classifying brain-stem auditory evoked potentials is described. A string of terminal symbols, as a formal representation of the evoked potential waveform, is processed by a regular attributed grammar. Its semantic functions return a list of numeric features that can be processed by a simple statistical classifier. Implementation of attributed grammars is discussed.
International Conference of the IEEE Engineering in Medicine and Biology Society | 1989
Ivan Bruha; Gopal P. Madhavan
A decision-support system for neurological diagnoses is described. The evoked-potential waveforms are analyzed by a syntactic pattern recognition algorithm that returns a list of numerical features. The second processing step includes a neural net (two-layer perceptron) that processes the list of numerical features. Methods of incorporating a knowledge-based subsystem in the evoked-potential recognition system in order to obtain more reliable results are analyzed.
Cybernetics and Systems | 1989
Ivan Bruha
This article attempts to specify the fundamental differences between adaptive systems and learning systems from an artificial intelligence viewpoint, and the necessary conditions of adaptation and learning. Less formal definitions of an adaptive system, an (inductive) learning system from examples, and an (inductive) learning system from observation are presented, and the definitions are compared.
Archive | 2000
Ivan Bruha; Petr Berka
Machine learning (ML) algorithms have been capable of processing symbolic, categorial data only. Real-world problems, particularly in medicine, comprise not only symbolic but also numerical attributes. There are several approaches to discretizing (categorizing) numerical attributes. This article describes two newer algorithms for such discretization.
International Symposium on Methodologies for Intelligent Systems | 2003
Ivan Bruha
In our project, we have selected the paradigm of genetic algorithms (GAs) for search through a space of all possible concept descriptions. We designed and implemented a system that integrates a domain-independent GA into the covering learning algorithm CN4, a large extension of the well-known CN2; we call it GA-CN4.