Luigi Malagò
Polytechnic University of Milan
Publications
Featured research published by Luigi Malagò.
Foundations of Genetic Algorithms | 2011
Luigi Malagò; Matteo Matteucci; Giovanni Pistone
In this paper we present a geometrical framework for the analysis of Estimation of Distribution Algorithms (EDAs) based on the exponential family. From a theoretical point of view, an EDA can be modeled as a sequence of densities in a statistical model that converges towards distributions with reduced support. Under this framework, at each iteration the empirical mean of the fitness function decreases in probability, until convergence of the population. This is the context of stochastic relaxation, i.e., the idea of looking for the minima of a function by minimizing its expected value over a set of probability densities. Our main interest is in the study of the gradient of the expected value of the function to be minimized, and in particular in how its landscape changes according to the fitness function and the statistical model used in the relaxation. After introducing some properties of the exponential family, such as the description of its topological closure and of its tangent space, we provide a characterization of the stationary points of the relaxed problem, together with a study of the minimizing sequences with reduced support. The analysis developed in the paper aims to provide a theoretical understanding of the behavior of EDAs, and in particular of their ability to converge to the global minimum of the fitness function. The theoretical results of this paper, besides providing a formal framework for the analysis of EDAs, lead to the definition of a new class of algorithms for binary function optimization based on Stochastic Natural Gradient Descent (SNGD), where the estimation of the parameters of the distribution is replaced by the direct update of the model parameters via an estimate of the natural gradient of the expected value of the fitness function.
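As an illustration of the SNGD scheme sketched in the abstract, the following is a minimal sketch, not the authors' reference implementation: it uses the independence model in the exponential family, with sufficient statistics T_i(x) = x_i, and estimates the natural gradient of the expected fitness from a sampled population. The fitness function, sample size, and step size are illustrative assumptions.

    # Minimal sketch of Stochastic Natural Gradient Descent for binary
    # minimization over the independence model p_theta(x) ~ exp(sum_i theta_i x_i).
    # For this model the Fisher information matrix is diagonal, so the natural
    # gradient is Cov(f, x_i) / Var(x_i) coordinate-wise.
    import numpy as np

    def f(X):                       # illustrative fitness, minimized at all ones
        return -np.sum(X, axis=1)

    rng = np.random.default_rng(0)
    n, N, eta = 20, 100, 0.5        # problem size, sample size, step size (assumed)
    theta = np.zeros(n)             # natural parameters of the model

    for t in range(200):
        p = 1.0 / (1.0 + np.exp(-theta))            # mean parameters E[x_i]
        X = (rng.random((N, n)) < p).astype(float)  # sample the population
        fx = f(X)
        cov = ((fx - fx.mean())[:, None] * (X - X.mean(0))).mean(0)  # Cov(f, x_i)
        var = np.maximum(X.var(0), 1e-8)                             # Var(x_i)
        theta -= eta * cov / var    # step along the estimated natural gradient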
Congress on Evolutionary Computation | 2013
Luigi Malagò; Matteo Matteucci; Giovanni Pistone
The geometric framework based on Stochastic Relaxation makes it possible to describe from a common perspective different model-based optimization algorithms that use statistical models to guide the search for the optimum. In this paper Stochastic Relaxation is used to provide theoretical results on Estimation of Distribution Algorithms (EDAs). Using Stochastic Relaxation, we show how the estimation of the fitness model by least squares linear regression corresponds to the estimation of the natural gradient. This equivalence makes it possible to perform model selection and robust estimation of the natural gradient simultaneously. Finally, we interpret Linear Programming relaxation as an example of Stochastic Relaxation with respect to the regular gradient.
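The stated equivalence is easy to verify numerically. The sketch below, with an arbitrary fitness and a fixed sampling model (both assumptions), checks that ordinary least squares regression of the fitness on the centered sufficient statistics returns the same vector as Fisher^{-1} Cov(T, f), the empirical natural gradient.

    # Sketch: least squares on centered statistics == empirical natural gradient.
    import numpy as np

    rng = np.random.default_rng(1)
    N, n = 500, 6
    X = (rng.random((N, n)) < 0.5).astype(float)      # samples from a fixed model
    fx = X @ rng.normal(size=n) + X[:, 0] * X[:, 1]   # arbitrary fitness values

    Z = X - X.mean(0)                                  # centered statistics T_i(x) = x_i
    beta = np.linalg.lstsq(Z, fx - fx.mean(), rcond=None)[0]  # regression coefficients

    F = np.cov(X, rowvar=False)                        # empirical Fisher matrix Cov(T, T)
    g = np.cov(X, fx, rowvar=False)[:n, n]             # empirical Cov(T, f)
    print(np.allclose(beta, np.linalg.solve(F, g)))    # True: the estimates coincide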
Congress on Evolutionary Computation | 2010
Gabriele Valentini; Luigi Malagò; Matteo Matteucci
This paper presents the Evolutionary Optimization Tool (Evoptool), an optimization toolkit that implements a set of meta-heuristics based on the Evolutionary Computation paradigm. Evoptool provides a common platform for the development and testing of new algorithms, in order to facilitate performance comparison. The toolkit offers a wide set of benchmark problems, from classical toy examples to complex tasks, and a collection of implementations of algorithms from the Genetic Algorithm and Estimation of Distribution Algorithm paradigms. Evoptool is flexible and easy to extend, including with algorithms based on approaches beyond Evolutionary Computation.
Congress on Evolutionary Computation | 2011
Luigi Malagò; Matteo Matteucci; Gabriele Valentini
Estimation of Distribution Algorithms evolve populations of candidate solutions to an optimization problem by introducing a statistical model, and by replacing the classical variation operators of Genetic Algorithms with statistical operators, such as estimation and sampling. The choice of the model plays a key role in the evolutionary process, since it strongly affects convergence to the global optimum. From this point of view, in a black-box context, especially when the interactions among variables in the objective function are sparse, it becomes fundamental for an EDA to choose the right model, one able to encode such interactions. In this paper we focus on EDAs based on undirected graphical models, such as Markov Networks. To learn the topology of the graph we apply a sparse method based on ℓ1-regularized logistic regression, which has been shown to be efficient in the high-dimensional case, i.e., when the number of observations is much smaller than the size of the sample space. We propose a new algorithm within the DEUM framework, called DEUMℓ1, able to learn the interaction structure of the problem without the need for prior knowledge, and we compare its performance with that of other popular EDAs over a set of well-known benchmarks.
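The structure-learning step can be sketched as follows, one ℓ1-penalized logistic regression per variable, with the nonzero coefficients read as edges of the Markov network; the penalty strength, the OR rule for symmetrizing the graph, and the use of scikit-learn are assumptions, not the paper's exact setup.

    # Sketch: neighborhood selection for a binary Markov network via
    # l1-regularized logistic regression (one regression per variable).
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def learn_structure(X, C=0.1):
        """X: (N, n) binary samples with both values present in every column.
        Returns a boolean adjacency matrix of the estimated Markov network."""
        N, n = X.shape
        adj = np.zeros((n, n), dtype=bool)
        for i in range(n):
            others = np.delete(np.arange(n), i)
            clf = LogisticRegression(penalty="l1", solver="liblinear", C=C)
            clf.fit(X[:, others], X[:, i])             # regress x_i on the rest
            adj[i, others] = np.abs(clf.coef_[0]) > 1e-6
        return adj | adj.T      # OR rule: keep an edge if either side selects it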
Entropy | 2014
Luigi Malagò; Giovanni Pistone
We discuss the use of the Newton method in the computation of max_p E_p[f], where p belongs to a statistical exponential family on a finite state space. In a number of papers, the authors have applied first-order search methods based on information geometry. Second-order methods have been widely used in optimization on manifolds, e.g., matrix manifolds, but appear to be new in statistical manifolds. These methods require the computation of the Riemannian Hessian in a statistical manifold. We use a non-parametric formulation of information geometry in view of further applications in the continuous state space case, where the construction of a proper Riemannian structure is still an open problem.
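On a finite state space all the geometric quantities involved can be computed exactly by enumeration. The sketch below, with illustrative sufficient statistics and objective, computes the gradient Cov_p(T, f), the Fisher metric Cov_p(T, T), and a natural gradient step; a full second-order method as discussed in the paper would additionally require the Riemannian Hessian, which involves higher-order moments of T.

    # Sketch: exact natural gradient on an exponential family over {0,1}^n.
    import numpy as np
    from itertools import product

    n = 3
    states = np.array(list(product([0, 1], repeat=n)), dtype=float)  # all 2^n points
    T = states                          # sufficient statistics T_i(x) = x_i (assumed)
    f = -states.sum(axis=1)             # illustrative function to minimize

    def step(theta, eta=0.5):
        w = np.exp(T @ theta)
        p = w / w.sum()                 # p_theta(x) = exp(theta . T(x) - psi(theta))
        ET = p @ T
        grad = (p * f) @ T - (p @ f) * ET          # Cov_p(T, f) = grad of E_p[f]
        Tc = T - ET
        fisher = Tc.T @ (p[:, None] * Tc)          # Fisher metric Cov_p(T, T)
        return theta - eta * np.linalg.solve(fisher + 1e-9 * np.eye(n), grad)

    theta = np.zeros(n)
    for _ in range(40):
        theta = step(theta)             # converges toward the minimizer (1,1,1)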
Genetic and Evolutionary Computation Conference | 2008
Luigi Malagò; Matteo Matteucci; Bernardo Dal Seno
Estimation of Distribution Algorithms are a recent meta-heuristic used in Genetics-Based Machine Learning to solve combinatorial and continuous optimization problems. One of the distinctive features of this family of algorithms is that the search for the optimum is performed within a candidate space of probability distributions associated with the problem, rather than over the population of possible solutions. A framework based on Information Geometry [3] is applied in this paper to propose a geometrical interpretation of the different operators used in EDAs and to provide a better understanding of the underlying behavior of this family of algorithms from a novel point of view. The analysis carried out and the simple examples introduced show the importance of the boundary of the statistical model with respect to the distributions an EDA may converge to.
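The boundary phenomenon mentioned above can be reproduced in a few lines. In the sketch below, with assumed operators (truncation selection, maximum likelihood estimation in the independence model, OneMax fitness), some parameters typically reach 0 or 1 after a few generations; the boundary is absorbing, so the corresponding bits can never change again, whether or not they are set correctly.

    # Sketch: a simple EDA loop drifting to the boundary of the model.
    import numpy as np

    rng = np.random.default_rng(3)
    n, N = 10, 20                          # small population: fixation happens fast
    p = np.full(n, 0.5)
    for t in range(100):
        X = (rng.random((N, n)) < p).astype(int)
        fx = X.sum(1)                      # OneMax fitness (illustrative)
        p = X[fx >= np.median(fx)].mean(0) # truncation selection + ML estimation
        if np.all((p == 0) | (p == 1)):    # once on the boundary, p is frozen
            break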
International Conference on Geometric Science of Information | 2013
Luigi Malagò; Matteo Matteucci
We are interested in the optimization of the expected value of a function by following a steepest descent policy over a statistical model. Such an approach appears in many model-based search meta-heuristics for optimization, for instance in the large class of random search methods in stochastic optimization and Evolutionary Computation. We study the case when the statistical models belong to the exponential family and the direction of maximum decrement of the expected value is given by the natural gradient, evaluated with respect to the Fisher information metric. When the gradient cannot be computed exactly, robust estimation makes it possible to minimize the number of function evaluations required to obtain convergence to the global optimum. Under the choice of centered sufficient statistics, the estimation of the natural gradient corresponds to solving a least squares regression problem for the original function to be optimized. The correspondence between estimating the natural gradient and solving a linear regression problem leads to the definition of regularized versions of the natural gradient. We propose a robust estimation of the natural gradient for the exponential family based on regularized least squares.
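A regularized variant of the estimator is immediate once the regression formulation is in place. The sketch below uses an ℓ2 (ridge) penalty; the data layout and the regularization strength are assumptions.

    # Sketch: ridge-regularized estimate of the natural gradient from samples.
    import numpy as np

    def regularized_natural_gradient(X, fx, lam=1e-2):
        """X: (N, n) samples, fx: (N,) function values. Returns the ridge
        least squares coefficients, read as a natural gradient estimate."""
        Z = X - X.mean(0)                  # centered sufficient statistics
        y = fx - fx.mean()
        n = X.shape[1]
        return np.linalg.solve(Z.T @ Z + lam * np.eye(n), Z.T @ y)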
Learning and Intelligent Optimization | 2012
Gabriele Valentini; Luigi Malagò; Matteo Matteucci
When the function to be optimized is characterized by a limited and unknown number of interactions among variables, a context that applies to many real-world scenarios, it is possible to design optimization algorithms based on such information. Estimation of Distribution Algorithms learn a set of interactions from a sample of points and encode them in a probabilistic model. The latter is then used to sample new instances. In this paper, we propose a novel approach to estimating the Markov Fitness Model used in DEUM. We combine model selection and model fitting by solving an ℓ1-constrained linear regression problem. Since the number of candidate interactions grows exponentially in the size of the problem, we first reduce this set with a preliminary coarse selection criterion based on Mutual Information. Then, we employ ℓ1-regularization to further enforce sparsity in the model, estimating its parameters at the same time. Our proposal is analyzed on the 3D Ising Spin Glass function, a problem known to be NP-hard, and it outperforms other popular black-box meta-heuristics.
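The two-stage procedure can be sketched as follows, restricted to pairwise interaction terms for brevity; the screening size, the lasso penalty, and the use of scikit-learn are assumptions.

    # Sketch: mutual-information screening followed by lasso estimation of a
    # linear-in-the-parameters fitness model.
    import numpy as np
    from itertools import combinations
    from sklearn.linear_model import Lasso

    def fit_fitness_model(X, fx, keep=20, alpha=1e-2):
        N, n = X.shape
        def mi(a, b):                       # empirical mutual information of two bits
            m = 0.0
            for va, vb in [(0, 0), (0, 1), (1, 0), (1, 1)]:
                pab = np.mean((a == va) & (b == vb))
                if pab > 0:
                    m += pab * np.log(pab / (np.mean(a == va) * np.mean(b == vb)))
            return m
        pairs = sorted(combinations(range(n), 2),
                       key=lambda ij: -mi(X[:, ij[0]], X[:, ij[1]]))[:keep]
        # design matrix: singleton terms plus the screened pairwise products
        Phi = np.hstack([X] + [X[:, [i]] * X[:, [j]] for i, j in pairs])
        return Lasso(alpha=alpha).fit(Phi, fx), pairs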
Learning and Intelligent Optimization | 2012
Emanuele Corsano; Davide Antonio Cucci; Luigi Malagò; Matteo Matteucci
In this paper we address the problem of model selection in Estimation of Distribution Algorithms from a novel perspective. We perform an implicit model selection by transforming the variables and choosing a low-dimensional model in the new variable space. We apply this paradigm in EDAs and introduce a novel algorithm called I-FCA, which makes use of the independence model in the transformed space, while still being able to recover higher-order interactions among the original variables. We evaluated the performance of the algorithm on well-known benchmark functions in a black-box context and compared it with other popular EDAs.
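The abstract does not spell out the transformation used by I-FCA, so the sketch below only illustrates the general idea with an assumed invertible linear map over GF(2): an independence model is fitted and sampled in the transformed coordinates, and mapping the samples back reintroduces higher-order dependencies among the original variables.

    # Sketch: independence model in a transformed space (illustrative transform).
    import numpy as np

    rng = np.random.default_rng(2)
    N, n = 100, 8
    X = rng.integers(0, 2, size=(N, n))           # a binary population (assumed)

    A = np.triu(np.ones((n, n), dtype=int))       # invertible 0/1 matrix over GF(2)
    Ainv = np.eye(n, dtype=int) + np.eye(n, k=1, dtype=int)   # its GF(2) inverse

    Y = (X @ A) % 2               # y_i = parity of x_1..x_i: a higher-order statistic
    p = Y.mean(0)                 # fit the independence model in the new space
    Ynew = (rng.random((N, n)) < p).astype(int)   # sample independent bits
    Xnew = (Ynew @ Ainv) % 2      # map back: dependencies among original variables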
Genetic and Evolutionary Computation Conference | 2014
Luigi Malagò; Tobias Glasmachers
In recent years there have been independent developments in multiple branches of Evolutionary Computation (EC) that interpret population-based and model-based search algorithms in terms of information geometric concepts. This trend has resulted in the development of novel algorithms and in improved understanding of existing ones. This tutorial aims at making this new line of research accessible to a broader range of researchers.

A statistical model, identified by a parametric family of distributions, is equipped with an intrinsic (Riemannian) geometry, the so-called information geometry. From this perspective, a statistical model is a manifold of distributions where the inner product is given by the Fisher information metric. Any evolutionary algorithm that implicitly or explicitly evolves the parameters of a search distribution defines a dynamic over the manifold. Taking into account the Riemannian geometry of the new search space given by the search distributions allows for the description and analysis of evolutionary operators in a new light. Notably, this framework can be used for the study of optimization algorithms.

A core idea of several recent and novel heuristics, both in the continuous and the discrete domain, such as Estimation of Distribution Algorithms (EDAs) and Natural Evolution Strategies (NESs), is to perform stochastic gradient descent directly on the space of search distributions. However, the definition of the gradient depends on the metric, which is why it becomes fundamental to consider the information geometry of the space of search distributions. Despite being equivalent to classical gradient-based methods for a stochastically relaxed problem, the approach performs randomized direct search on the original search space: the generation of an offspring population as well as selection and strategy adaptation turn out to implicitly sample a search distribution in a statistical model and to perform a stochastic gradient step in the direction of the natural gradient.

Particular strengths of the information geometric framework are its ability to unify optimization in discrete and continuous domains as well as the traditionally separate processes of optimization and strategy parameter adaptation. Respecting the intrinsic information geometry automatically results in powerful invariance principles. The framework can be seen as an analysis toolbox for existing methods, as well as a generic design principle for novel algorithms.

This tutorial will introduce from scratch the mathematical concept of information geometry to the EC community. It will convey not only rigorous definitions but also geometric intuition on Riemannian geometry, information geometry, the natural gradient, and stochastic gradient descent. Stochastic relaxations of EC problems will act as a glue. The framework will be made accessible with applications to basic as well as state-of-the-art algorithms operating on discrete and continuous domains.
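As a concrete instance of the natural gradient principle the tutorial is built around, the following is a minimal sketch of one step of such an algorithm for the Bernoulli independence model in mean parameters, where the natural gradient of a utility-weighted objective reduces to a weighted recombination; the rank-based utilities, selection of the best half, and step size are illustrative assumptions.

    # Sketch: one stochastic natural gradient (IGO-style) step for the
    # Bernoulli independence model, maximizing a given fitness.
    import numpy as np

    def igo_step(p, fitness, N=50, eta=0.1, rng=None):
        """p: mean parameters p_i = Pr[x_i = 1]; returns the updated vector."""
        rng = rng or np.random.default_rng()
        X = (rng.random((N, p.size)) < p).astype(float)   # sample offspring
        fx = fitness(X)
        best = np.argsort(fx)[-N // 2:]                   # rank-based selection
        w = np.zeros(N); w[best] = 2.0 / N                # uniform weights, best half
        # in mean parameters the natural gradient is sum_k w_k (x_k - p)
        p = p + eta * (w[:, None] * (X - p)).sum(0)
        return np.clip(p, 1e-3, 1 - 1e-3)

    # usage: p = np.full(20, 0.5)
    # for _ in range(300): p = igo_step(p, lambda X: X.sum(1))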