Ekaterina Vladislavleva
Tilburg University
Publication
Featured research published by Ekaterina Vladislavleva.
IEEE Transactions on Evolutionary Computation | 2009
Ekaterina Vladislavleva; Guido Smits; Dick den Hertog
This paper presents a novel approach to generate data-driven regression models that not only give reliable prediction of the observed data but also have smoother response surfaces and extra generalization capabilities with respect to extrapolation. These models are obtained as solutions of a genetic programming (GP) process, where selection is guided by a tradeoff between two competing objectives - numerical accuracy and the order of nonlinearity. The latter is a novel complexity measure that adopts the notion of the minimal degree of the best-fit polynomial, approximating an analytical function with a certain precision. Using nine regression problems, this paper presents and illustrates two different strategies for the use of the order of nonlinearity in symbolic regression via GP. The combination of optimization of the order of nonlinearity together with the numerical accuracy strongly outperforms "conventional" optimization of a size-related expressional complexity and the accuracy with respect to extrapolative capabilities of solutions on all nine test problems. In addition to exploiting the new complexity measure, this paper also introduces a novel heuristic of alternating several optimization objectives in a 2-D optimization framework. Alternating the objectives at each generation in such a way allows us to exploit the effectiveness of 2-D optimization when more than two objectives are of interest (in this paper, these are accuracy, expressional complexity, and the order of nonlinearity). Results of the experiments on all test problems suggest that alternating the order of nonlinearity of GP individuals with their structural complexity produces solutions that are both compact and have smoother response surfaces, and, hence, contributes to better interpretability and understanding.
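The paper defines the order of nonlinearity via the minimal degree of a best-fit polynomial approximating a function to a given precision. The sketch below is a simplified illustration of that idea, not the paper's implementation: it estimates a model's nonlinearity on sampled inputs as the smallest degree whose least-squares polynomial fit reproduces the model response within a tolerance. The function name, tolerance, and degree cap are assumptions.

```python
# Rough sketch of an "order of nonlinearity" estimate (illustrative only).
import numpy as np

def order_of_nonlinearity(model, x, tol=1e-3, max_degree=20):
    """Smallest degree d such that a degree-d polynomial fits model(x) within tol."""
    y = model(x)
    scale = max(np.max(np.abs(y)), 1e-12)
    for d in range(max_degree + 1):
        coeffs = np.polyfit(x, y, d)                       # least-squares polynomial fit
        err = np.max(np.abs(np.polyval(coeffs, x) - y)) / scale
        if err <= tol:
            return d
    return max_degree + 1                                  # treated as "highly nonlinear"

# Example: a quadratic model has order 2; a smooth transcendental one a low order.
x = np.linspace(-1.0, 1.0, 200)
print(order_of_nonlinearity(lambda z: 1.0 + 2.0 * z + 0.5 * z**2, x))  # -> 2
print(order_of_nonlinearity(np.sin, x))                                # small; sin is smooth here
```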
Archive | 2008
Mark Kotanchek; Guido Smits; Ekaterina Vladislavleva
Trust is a major issue with deploying empirical models in the real world since changes in the underlying system or use of the model in new regions of parameter space can produce (potentially dangerous) incorrect predictions. The trepidation involved with model usage can be mitigated by assembling ensembles of diverse models and using their consensus as a trust metric, since these models will be constrained to agree in the data region used for model development and free to disagree outside that region. The problem is to define an appropriate model complexity (since the ensemble should consist of models of similar complexity), as well as to identify diverse models from the candidate model set. In this chapter we discuss strategies for the development and selection of robust models and model ensembles and demonstrate those strategies against industrial data sets. An important benefit of this approach is that all available data may be used in the model development rather than a partition into training, test and validation subsets. The result is that the constituent models are more accurate without risk of over-fitting, the ensemble predictions are more accurate, and the ensemble predictions carry a meaningful trust metric.
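A minimal sketch of the ensemble-consensus idea described above (not the authors' code): the ensemble prediction is the median of the constituent models, and the spread of their predictions serves as the trust metric, growing in regions where the models were never constrained to agree. Model forms and data are illustrative assumptions.

```python
# Ensemble prediction with disagreement used as a trust metric (illustrative).
import numpy as np

def ensemble_predict(models, X):
    """models: list of callables; X: input array. Returns (prediction, spread)."""
    preds = np.array([m(X) for m in models])   # shape: (n_models, n_samples)
    prediction = np.median(preds, axis=0)      # consensus prediction
    spread = np.std(preds, axis=0)             # disagreement between models
    return prediction, spread

# Hypothetical models fit on data with x in [0, 1]: they agree there and
# diverge outside, so the spread flags untrustworthy extrapolation.
models = [lambda x: 1.0 + 2.0 * x,
          lambda x: 1.05 + 1.9 * x + 0.1 * x**2,
          lambda x: 0.95 + 2.1 * x - 0.05 * x**3]
x = np.array([0.5, 3.0])                       # in-region vs. extrapolation
pred, trust = ensemble_predict(models, x)
print(pred, trust)                             # spread is much larger at x = 3.0
```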
IEEE Transactions on Circuits and Systems for Video Technology | 2013
Nicolas Staelens; Dirk Deschrijver; Ekaterina Vladislavleva; Brecht Vermeulen; Tom Dhaene; Piet Demeester
In order to ensure optimal quality of experience for end users during video streaming, automatic video quality assessment becomes an important field of interest to video service providers. Objective video quality metrics try to estimate perceived quality with high accuracy and in an automated manner. In traditional approaches, these metrics model the complex properties of the human visual system. More recently, however, it has been shown that machine learning approaches can also yield competitive results. In this paper, we present a novel no-reference bitstream-based objective video quality metric that is constructed by genetic programming-based symbolic regression. A key benefit of this approach is that it calculates reliable white-box models that allow us to determine the importance of the parameters. Additionally, these models can provide human insight into the underlying principles of subjective video quality assessment. Numerical results show that perceived quality can be modeled with high accuracy using only parameters extracted from the received video bitstream.
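As a hedged sketch of the general workflow (bitstream parameters in, an inspectable white-box expression out), the snippet below uses the third-party gplearn library as a stand-in for the authors' GP system. The feature names and the synthetic target are purely illustrative assumptions, not the parameters or data from the paper.

```python
# Symbolic regression from bitstream-style features to a quality score (sketch).
import numpy as np
from gplearn.genetic import SymbolicRegressor

rng = np.random.default_rng(0)
n = 500
# Hypothetical bitstream features: bitrate (Mbps), packet-loss ratio, quantization parameter.
X = np.column_stack([rng.uniform(1, 10, n),
                     rng.uniform(0.0, 0.05, n),
                     rng.uniform(20, 45, n)])
# Synthetic "perceived quality" target, for demonstration only.
y = (4.5 - 30.0 * X[:, 1] - 0.05 * (X[:, 2] - 20) + 0.02 * X[:, 0]
     + rng.normal(0, 0.1, n))

est = SymbolicRegressor(population_size=500, generations=20,
                        function_set=('add', 'sub', 'mul', 'div'),
                        parsimony_coefficient=0.01, random_state=0)
est.fit(X, y)
print(est._program)   # the evolved white-box expression, open to inspection
```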
Archive | 2007
Mark Kotanchek; Guido Smits; Ekaterina Vladislavleva
The ParetoGP algorithm, which adopts a multi-objective optimization approach to balancing expression complexity and accuracy, has proven to have significant impact on symbolic regression of industrial data due to its improvement in speed and quality of model development as well as user model selection (Smits and Kotanchek, 2004; Smits et al., 2005; Castillo et al., 2006). In this chapter, we explore a range of topics related to exploiting the Pareto paradigm. First we describe and explore the strengths and weaknesses of the ClassicGP and Pareto-Front GP variants for symbolic regression as well as touch on related algorithms. Next, we show a derivation for the selection intensity of tournament selection with multiple winners (albeit, in a single-objective case). We then extend classical tournament and elite selection strategies into a multi-objective framework which allows classical GP schemes to be readily Pareto-aware. Finally, we introduce the latest extension of the Pareto paradigm which is the melding with ordinal optimization. It appears that ordinal optimization will provide a theoretical foundation to guide algorithm design. Application of these insights has already produced at least a four-fold improvement in the ParetoGP performance for a suite of test problems.
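A minimal sketch of what a "Pareto-aware" tournament can look like, under the assumption of two minimized objectives (error, complexity); this is an illustration of the general idea rather than the chapter's code.

```python
# Pareto-aware tournament selection over (error, complexity) pairs (sketch).
import random

def dominates(a, b):
    """a dominates b if it is no worse in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_tournament(population, objectives, size=4, rng=random):
    """population: list of individuals; objectives: parallel list of objective tuples."""
    idx = rng.sample(range(len(population)), size)
    front = [i for i in idx
             if not any(dominates(objectives[j], objectives[i]) for j in idx if j != i)]
    return population[rng.choice(front)]       # winner drawn from the non-dominated subset

# Hypothetical usage with (error, complexity) scores.
pop = ['m1', 'm2', 'm3', 'm4', 'm5']
obj = [(0.10, 12), (0.08, 20), (0.08, 9), (0.30, 3), (0.12, 15)]
print(pareto_tournament(pop, obj, size=4))
```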
IEEE International Conference on Evolutionary Computation | 2006
Guido Smits; Ekaterina Vladislavleva
This paper introduces the first attempt to combine the theory of ordinal optimization and symbolic regression via genetic programming. A new approach called ordinal ParetoGP obtains considerably fitter solutions with more consistency between independent runs while spending less computational effort. The conclusions are supported by a number of experiments using three symbolic regression benchmark problems of various sizes.
Archive | 2010
Rick L. Riolo; Trent McConaghy; Ekaterina Vladislavleva
The contributions in this volume are written by the foremost international researchers and practitioners in the GP arena. They examine the similarities and differences between theoretical and empirical results on real-world problems. The text explores the synergy between theory and practice, producing a comprehensive view of the state of the art in GP applications. Topics include: FINCH: A System for Evolving Java, Practical Autoconstructive Evolution, The Rubik Cube and GP Temporal Sequence Learning, Ensemble classifiers: AdaBoost and Orthogonal Evolution of Teams, Self-modifying Cartesian GP, Abstract Expression Grammar Symbolic Regression, Age-Fitness Pareto Optimization, Scalable Symbolic Regression by Continuous Evolution, Symbolic Density Models, GP Transforms in Linear Regression Situations, Protein Interactions in a Computational Evolution System, Composition of Music and Financial Strategies via GP, and Evolutionary Art Using Summed Multi-Objective Ranks. Readers will discover large-scale, real-world applications of GP to a variety of problem domains via in-depth presentations of the latest and most significant results in GP.
IEEE Transactions on Evolutionary Computation | 2010
Ekaterina Vladislavleva; Guido Smits; Dick den Hertog
Symbolic regression of input-output data conventionally treats data records equally. We suggest a framework for automatic assignment of weights to data samples, which takes into account the samples' relative importance. In this paper, we study the possibilities of improving symbolic regression on real-life data by incorporating weights into the fitness function. We introduce four weighting schemes defining the importance of a point relative to proximity, surrounding, remoteness, and nonlinear deviation from k nearest-in-the-input-space neighbors. For enhanced analysis and modeling of large imbalanced data sets we introduce a simple multidimensional iterative technique for subsampling. This technique allows a sensible partitioning (and compression) of data into nested subsets of an arbitrary size in such a way that the subsets are balanced with respect to any of the presented weighting schemes. For cases where a given input-output data set contains some redundancy, we suggest an approach to considerably improve the effectiveness of regression by applying more modeling effort to a smaller subset of the data set that has a similar information content. Such improvement is achieved due to better exploration of the search space of potential solutions at the same number of function evaluations. We compare different approaches to regression on five benchmark problems with a fixed budget allocation. We demonstrate that a significant improvement in the quality of the regression models can be obtained with weighted regression, with exploratory regression on a compressed subset of similar information content, or with exploratory weighted regression on the compressed subset, weighted with one of the proposed weighting schemes.
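The sketch below illustrates one proximity-style weighting scheme in the spirit of the paper, not its exact definitions: each record is weighted by the mean Euclidean distance to its k nearest neighbors in input space, so isolated points in sparse regions carry more weight in the fitness function. The normalization and the weighted fitness are assumptions.

```python
# Neighbor-distance based sample weighting for weighted regression (sketch).
import numpy as np

def proximity_weights(X, k=5):
    """X: (n_samples, n_inputs). Returns weights that sum to n_samples (mean weight 1)."""
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)  # pairwise distances
    np.fill_diagonal(dists, np.inf)                                  # ignore self-distance
    knn = np.sort(dists, axis=1)[:, :k]                              # k nearest per point
    w = knn.mean(axis=1)                                             # sparse regions -> larger weight
    return w * (len(X) / w.sum())

# Hypothetical weighted fitness used inside the GP fitness function.
X = np.random.default_rng(1).uniform(size=(200, 3))
w = proximity_weights(X, k=5)

def weighted_mse(y_true, y_pred, weights=w):
    return np.average((y_true - y_pred) ** 2, weights=weights)
```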
Archive | 2013
Rick L. Riolo; Ekaterina Vladislavleva; Marylyn D. Ritchie; Jason H. Moore
These contributions, written by the foremost international researchers and practitioners of Genetic Programming (GP), explore the synergy between theoretical and empirical results on real-world problems, producing a comprehensive view of the state of the art in GP. Topics in this volume include: evolutionary constraints, relaxation of selection mechanisms, diversity preservation strategies, flexing fitness evaluation, evolution in dynamic environments, multi-objective and multi-modal selection, foundations of evolvability, evolvable and adaptive evolutionary operators, foundation of injecting expert knowledge in evolutionary search, analysis of problem difficulty and required GP algorithm complexity, foundations of running GP on the cloud (communication, cooperation, flexible implementation), and ensemble methods. Additional focal points for GP symbolic regression are: (1) The need to guarantee convergence to solutions in the function discovery mode; (2) Issues on model validation; (3) The need for model analysis workflows for insight generation based on generated GP solutions (model exploration, visualization, variable selection, dimensionality analysis); (4) Issues in combining different types of data. Readers will discover large-scale, real-world applications of GP to a variety of problem domains via in-depth presentations of the latest and most significant results.
Archive | 2011
Guido Smits; Ekaterina Vladislavleva; Mark Kotanchek
The future of computing is one of massive parallelism. To exploit this and generate maximum performance, it will be inevitable that more co-design between hardware and software takes place. Many software algorithms need rethinking to expose all the possible concurrency, increase locality, and have built-in fault tolerance. Evolutionary algorithms are naturally parallel and should as such have an edge in exploiting these hardware features.
PLOS Computational Biology | 2014
Lander Willem; Sean Stijven; Ekaterina Vladislavleva; Jan Broeckhove; Philippe Beutels; Niel Hens
Modeling plays a major role in policy making, especially for infectious disease interventions, but such models can be complex and computationally intensive. A more systematic exploration is needed to gain a thorough understanding of such systems. We present an active learning approach, based on machine learning techniques such as iterative surrogate modeling and model-guided experimentation, to systematically analyze both common and edge manifestations of complex model runs. Symbolic regression is used for nonlinear response surface modeling with automatic feature selection. First, we illustrate our approach using an individual-based model for influenza vaccination. After optimizing the parameter space, we observe an inverse relationship between vaccination coverage and cumulative attack rate, reinforced by herd immunity. Second, we demonstrate the use of surrogate modeling techniques on input-response data from a deterministic dynamic model, which was designed to explore the cost-effectiveness of varicella-zoster virus vaccination. We use symbolic regression to handle high dimensionality and correlated inputs and to identify the most influential variables. The insight provided is used to focus research, reduce dimensionality, and decrease decision uncertainty. We conclude that active learning is needed to fully understand complex systems behavior. Surrogate models can be readily explored at no computational expense, and can also be used as an emulator to improve rapid policy making in various settings.
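A hedged sketch of the iterative surrogate-modeling and model-guided experimentation loop described above: fit a cheap surrogate on the runs evaluated so far, query the expensive model where the surrogate is least certain, and repeat. The placeholder simulator, the candidate grid, and the use of a Gaussian-process surrogate (instead of symbolic regression) are illustrative assumptions, not the paper's setup.

```python
# Active-learning loop with a surrogate standing in for an expensive simulation (sketch).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def expensive_model(x):
    """Placeholder for a costly individual-based or dynamic simulation run."""
    return np.sin(3 * x[0]) + 0.5 * x[1] ** 2

rng = np.random.default_rng(0)
candidates = rng.uniform(0, 1, size=(500, 2))      # candidate parameter settings
X = candidates[:5].copy()                          # small initial design
y = np.array([expensive_model(x) for x in X])

for _ in range(20):                                # active-learning iterations
    surrogate = GaussianProcessRegressor().fit(X, y)
    _, std = surrogate.predict(candidates, return_std=True)
    x_next = candidates[np.argmax(std)]            # query where the surrogate is least certain
    X = np.vstack([X, x_next])
    y = np.append(y, expensive_model(x_next))

# The final surrogate can now be explored cheaply in place of the simulator.
```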