Kangil Kim | Researchain

Publication

Featured research published by Kangil Kim.

Genetic Programming and Evolvable Machines | 2014

Probabilistic model building in genetic programming: a critical review

Kangil Kim; Yin Shan; Xuan Hoai Nguyen; Robert I. McKay

Probabilistic model-building algorithms (PMBA), a subset of evolutionary algorithms, have been successful in solving complex problems, while also providing analytical information about the distribution of fit individuals. Most PMBA work has concentrated on the string representation used in typical genetic algorithms. A smaller body of work has aimed to apply the useful concepts of PMBA to genetic programming (GP), mostly concentrating on tree representation. Unfortunately, the latter research has been carried out sporadically and reported in several different research streams, limiting substantial communication and discussion. In this paper, we aim to provide a critical review of previous applications of PMBA and related methods in GP research, to facilitate more vital communication. We illustrate the current state of research in applying PMBA to GP, noting important perspectives. We use these to categorise practical PMBA models for GP, and describe the main varieties on this basis.

Pacific Rim International Conference on Artificial Intelligence | 2010

Sampling bias in estimation of distribution algorithms for genetic programming using prototype trees

Kangil Kim; Bob McKay; Dharani Punithan

Probabilistic models are widely used in evolutionary and related algorithms. In Genetic Programming (GP), the Probabilistic Prototype Tree (PPT) is often used as a model representation. Drift due to sampling bias is a widely recognised problem, and may be serious, particularly in dependent probability models. While this has been closely studied in independent probability models, and more recently in probabilistic dependency models, it has received little attention in systems with strict dependence between probabilistic variables such as arise in PPT representation. Here, we investigate this issue, and present results suggesting that the drift effect in such models may be particularly severe - so severe as to cast doubt on their scalability. We present a preliminary analysis through a factor representation of the joint probability distribution. We suggest future directions for research aiming to overcome this problem.

IEEE Transactions on Evolutionary Computation | 2013

Stochastic Diversity Loss and Scalability in Estimation of Distribution Genetic Programming

Kangil Kim; Robert I. McKay

In estimation of distribution algorithms (EDAs), probability models hold accumulating evidence on the location of an optimum. Stochastic sampling drift has been heavily researched in EDA optimization but not in EDAs applied to genetic programming (EDA-GP). We show that, for EDA-GPs using probabilistic prototype tree models, stochastic drift in sampling and selection is a serious problem, inhibiting scaling to complex problems. Problems requiring deep dependence in their probability structure see such rapid stochastic drift that the usual methods for controlling drift are unable to compensate. We propose a new alternative, analogous to likelihood weighting of evidence. We demonstrate in a small-scale experiment that it does counteract the drift, sufficiently to leave EDA-GP systems subject to similar levels of stochastic drift to other EDAs.

European Conference on Genetic Programming | 2011

Operator self-adaptation in genetic programming

Min Hyeok Kim; Robert I. McKay; Nguyen Xuan Hoai; Kangil Kim

We investigate the application of adaptive operator selection rates to Genetic Programming. Results confirm those from other areas of evolutionary algorithms: adaptive rate selection out-performs non-adaptive methods, and among adaptive methods, adaptive pursuit out-performs probability matching. Adaptive pursuit combined with a reward policy that rewards the overall fitness change in the elite worked best of the strategies tested, though not uniformly on all problems.
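
The two adaptive strategies named in this abstract can be sketched as follows. This is a minimal illustration of the textbook update rules for probability matching and adaptive pursuit, not the configuration used in the paper: the function names, the learning rates `alpha` and `beta`, and the floor `p_min` are all assumptions chosen for the example, and the paper's reward policies are not modelled.

```python
def update_quality(quality, op, reward, alpha=0.1):
    """Exponentially smoothed reward estimate for the operator that was applied."""
    quality = list(quality)
    quality[op] += alpha * (reward - quality[op])
    return quality


def probability_matching(quality, p_min=0.05):
    """Selection rates proportional to estimated quality, with a floor of p_min."""
    k = len(quality)
    total = sum(quality)
    if total == 0:
        return [1.0 / k] * k  # no evidence yet: select uniformly
    return [p_min + (1 - k * p_min) * q / total for q in quality]


def adaptive_pursuit(probs, quality, p_min=0.05, beta=0.1):
    """Move the current best operator towards p_max and all others towards p_min."""
    k = len(probs)
    p_max = 1 - (k - 1) * p_min
    best = max(range(k), key=lambda i: quality[i])
    return [
        p + beta * ((p_max if i == best else p_min) - p)
        for i, p in enumerate(probs)
    ]
```

With fixed quality estimates, probability matching keeps the rates proportional to quality, whereas repeated adaptive pursuit updates drive the best operator's rate towards `p_max`. That winner-take-most behaviour is the usual explanation offered for why pursuit reacts more decisively to the currently best operator.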

IEEE Transactions on Evolutionary Computation | 2016

Recursion-Based Biases in Stochastic Grammar Model Genetic Programming

Kangil Kim; R. I. Bob McKay; Nguyen Xuan Hoai

Estimation of distribution algorithms (EDAs) applied to genetic programming (GP) have been studied by a number of authors. Like all EDAs, they suffer from biases induced by the model building and sampling process. However, the biases are amplified in the algorithms for GP. In particular, many systems use stochastic grammars as their model representation, but biases arise due to grammar recursion. We define and estimate the bias due to recursion in grammar-based EDAs in GP, using methods derived from computational linguistics. We confirm the extent of bias in some simple experimental examples. We then propose some methods to repair this bias. We apply the estimation of bias, and its repair, to some more practical applications. We experimentally demonstrate the extent of bias arising from recursion, and the performance improvements that can result from correcting it.

Congress on Evolutionary Computation | 2013

Cutting evaluation costs: An investigation into early termination in genetic programming

Namyong Park; Kangil Kim; Robert I. McKay

Genetic programming is very computationally intensive, particularly in CPU time. A number of approaches to evaluation cost reduction have been proposed, among them early termination of evaluation (applicable in problem domains where estimates of the final fitness value are available during evaluation). Like all cost reduction techniques, early termination balances overall computation cost against the risk of finding worse solutions. We evaluate the influence of various properties of the problem domain - problem class, reliability of fitness estimates, trajectory of fitness estimates, and evolutionary trajectory - to determine whether any is able to predict the effects of early termination. There is little correlation with any of these, with one exception: Boolean problems, which see little change in running time and hence only small changes in performance, are distinguished by both problem class and each of the other metrics.

Parallel Problem Solving from Nature | 2012

Analysing the effects of diverse operators in a genetic programming system

MinHyeok Kim; Bob McKay; Kangil Kim; Xuan Hoai Nguyen

Some Genetic Programming (GP) systems have fewer structural constraints than expression tree GP, permitting a wider range of operators. Using one such system, TAG3P, we compared the effects of such new operators with more standard ones on individual fitness, size and depth, comparing them on a number of symbolic regression and tree structuring problems. The operator effects were diverse, as the originators had claimed. The results confirm the overall primacy of crossover, but strongly suggest that new operators can usefully supplement, or even replace, subtree mutation. They give a better understanding of the features of each operator, and the contexts where it is likely to be useful. They illuminate the diverse effects of different operators, and provide justification for adaptive use of a range of operators.

Congress on Evolutionary Computation | 2012

Implicit bias and recursive grammar structures in estimation of distribution genetic programming

Kangil Kim; Hoai Nguyen Xuan; Bob McKay

Much recent research in Estimation of Distribution Algorithms (EDA) applied to Genetic Programming has adopted a Stochastic Context Free Grammar (SCFG)-based model formalism. However, these methods generate biases which may be indistinguishable from selection bias, resulting in sub-optimal performance. The primary factor generating this bias is the combined effect of recursion in the grammars and depth limitation removing some sample trees from the distribution. Here, we demonstrate the bias and provide exact estimates of its scale (assuming infinite populations and simple recursions). We define a quantity h which determines both whether bias occurs (h > 1) and its scale. We apply this analysis to a number of simple illustrative grammars, and to a range of practically-used GP grammars, showing that this bias is both real and important.
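
The interaction between recursion and depth limitation can be illustrated with a toy calculation. The sketch below assumes a hypothetical one-nonterminal grammar, `S -> S S` with probability `p` and `S -> a` with probability `1 - p`, and computes exactly how discarding over-deep trees lowers the observed rate of the recursive rule at the root. It does not compute the abstract's quantity h, whose definition is not given here; the grammar and function names are assumptions made for illustration.

```python
def prob_depth_at_most(p, depth):
    """Probability that a derivation from S has depth <= depth.

    A tree consisting of S -> a alone has depth 1. The recurrence is
    q_d = (1 - p) + p * q_{d-1}**2, with q_0 = 0.
    """
    q = 0.0
    for _ in range(depth):
        q = (1 - p) + p * q * q
    return q


def root_recursion_rate_after_limit(p, depth):
    """Share of accepted trees (depth <= depth) whose root uses S -> S S.

    Assumes depth >= 1, so that at least the tree S -> a is accepted.
    """
    accepted = prob_depth_at_most(p, depth)
    recursive_and_accepted = p * prob_depth_at_most(p, depth - 1) ** 2
    return recursive_and_accepted / accepted
```

For example, `root_recursion_rate_after_limit(0.4, 5)` is strictly below 0.4: re-estimating the rule probability from the accepted trees would lower it even without any fitness-based selection, which is the kind of bias that can be mistaken for selection bias.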

Genetic and Evolutionary Computation Conference | 2011

Structural difficulty in estimation of distribution genetic programming

Kangil Kim; Min Hyeok Kim; Bob McKay

Estimation of Distribution Algorithms were introduced into Genetic Programming over 15 years ago, and have demonstrated good performance on a range of problems, but there has been little research into their limitations. We apply two such algorithms - scalar and vectorial Stochastic Grammar GP - to Daida's well-known Lid problem, to better understand their ability to learn specific structures. The scalar algorithm performs poorly, but the vectorial version shows good overall performance. We then extended Daida's problem to explore the vectorial algorithm's ability to find even more specific structures, finding that the performance fell off rapidly as the specificity of the required structure increased. Thus although this particular system has less severe structural difficulty issues than standard GP, it is by no means free of them. Track: Genetic Programming

Applied Soft Computing | 2011

Investigating vesicular selection

Yun-Geun Lee; Bob McKay; Kangil Kim; Dongkyun Kim; Nguyen Xuan Hoai

Directed protein evolution has led to major advances in organic chemistry, enabling the development of highly optimised proteins. The SELEX method has also been highly effective in evolving ribose nucleic acid (RNA) or deoxy-ribose nucleic acid (DNA) molecules; variants have been proposed which allow SELEX to be used in protein evolution. All of these methods can be viewed as evolutionary algorithms implemented in chemistry. A number of methods rely on selection of natural cells, or of artificial bubbles. These methods result in a new form of selection mechanism, which we call vesicular selection (VS). It is not, prima facie, clear whether VS is an effective selection mechanism, or how its performance is affected by changes in vesicle size. It is difficult to investigate this in vitro, so we use in silico methods derived from evolutionary computation. The primary aim is to test whether this selection method hinders biochemical evolutionary search (in which case, it might be worth investing research effort in discovering alternative selection methods). An in silico implementation of this selection method, embedded in an otherwise-typical evolutionary computation system, shows reasonable ability to solve tough optimisation problems, together with an acceptable ability to concentrate the solutions found. We compare it with tournament selection (TS), a standard evolutionary computation method, which can be finely tuned for high selection pressure, but only coarsely tuned for low selection pressure. By contrast, the new selection mechanism VS is highly tunable at low selection pressures. It is thus particularly suited to problem domains where extensive exploration capabilities are required. Since there is very good reason to believe that protein search spaces require highly exploratory search, the selection mechanism is well matched to its application in combinatorial chemistry.
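
The remark that tournament selection is only coarsely tunable at low selection pressure can be made concrete with a short calculation. The sketch below covers standard tournament selection only, under the assumptions that entrants are drawn with replacement and all fitness values are distinct; it does not model vesicular selection, and the function name is hypothetical.

```python
def win_probability(rank, pop_size, k):
    """Probability that the individual of the given rank (1 = best) wins a
    tournament of k entrants drawn uniformly with replacement.

    The winner has this rank exactly when every entrant is of this rank or
    worse, but not every entrant is strictly worse.
    """
    this_rank_or_worse = pop_size - rank + 1
    return (this_rank_or_worse ** k - (this_rank_or_worse - 1) ** k) / pop_size ** k
```

With `k = 1` every individual is chosen with probability `1 / pop_size`, so there is no selection pressure at all. The next available setting, `k = 2`, already selects the best individual with probability `(2 * pop_size - 1) / pop_size ** 2`, close to double the uniform rate. Since `k` must be an integer, nothing lies between these two settings, whereas successive large values of `k` differ comparatively little from one another.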
