Publication


Featured research published by Mario Graff.


European Conference on Genetic Programming | 2009

There Is a Free Lunch for Hyper-Heuristics, Genetic Programming and Computer Scientists

Riccardo Poli; Mario Graff

In this paper we prove that in some practical situations, there is a free lunch for hyper-heuristics, i.e., for search algorithms that search the space of solvers, searchers, meta-heuristics and heuristics for problems. This has consequences for the use of genetic programming as a method to discover new search algorithms and, more generally, problem solvers. Furthermore, it has also rather important philosophical consequences in relation to the efforts of computer scientists to discover useful novel search algorithms.


Artificial Intelligence | 2010

Practical performance models of algorithms in evolutionary program induction and other domains

Mario Graff; Riccardo Poli

Evolutionary computation techniques have gained considerable popularity as problem-solving and optimisation tools in recent years. Theoreticians have developed a variety of both exact and approximate models for evolutionary program induction algorithms. However, these models are often criticised for being only applicable to simplistic problems or algorithms with unrealistic parameters. In this paper, we start rectifying this situation in relation to what matters the most to practitioners and users of program induction systems: performance. That is, we introduce a simple and practical model for the performance of program-induction algorithms. To test our approach, we consider two important classes of problems - symbolic regression and Boolean function induction - and we model different versions of genetic programming, gene expression programming and stochastic iterated hill climbing in program space. We illustrate the generality of our technique by also accurately modelling the performance of a training algorithm for artificial neural networks and two heuristics for the off-line bin packing problem. We show that our models, besides making accurate predictions, can help in the analysis and comparison of different algorithms and/or algorithms with different parameter settings. We illustrate this via the automatic construction of a taxonomy for the stochastic program-induction algorithms considered in this study. The taxonomy reveals important features of these algorithms from the performance point of view, which are not detected by ordinary experimentation.


Information Systems | 2015

Near neighbor searching with K nearest references

Edgar Chávez; Mario Graff; Gonzalo Navarro; Eric Sadit Tellez

Proximity searching is the problem of retrieving, from a given database, those objects closest to a query. To avoid exhaustive searching, data structures called indexes are built on the database prior to serving queries. The curse of dimensionality is a well-known problem for indexes: in spaces with sufficiently concentrated distance histograms, no index outperforms an exhaustive scan of the database. In recent years, a number of indexes for approximate proximity searching have been proposed. These are able to cope with the curse of dimensionality in exchange for returning an answer that might be slightly different from the correct one. In this paper we show that many of those recent indexes can be understood as variants of a simple general model based on K-nearest reference signatures. A set of references is chosen from the database, and the signature of each object consists of the K references nearest to the object. At query time, the signature of the query is computed and the search examines only the objects whose signature is close enough to that of the query. Many known and novel indexes are obtained by considering different ways to determine how much detail the signature records (e.g., just the set of nearest references, or also their proximity order to the object, or also their distances to the object, and so on), how the similarity between signatures is defined, and how the parameters are tuned. In addition, we introduce a space-efficient representation for those families of indexes, making it possible to search very large databases in main memory. Small indexes are cache friendly, inducing faster queries. We perform exhaustive experiments comparing several known and new indexes that derive from our framework, evaluating their time performance, memory usage, and quality of approximation. The best indexes outperform the state of the art, offering an attractive balance between all these aspects, and turn out to be excellent choices in many scenarios. Our framework gives high flexibility to design new indexes.

Highlights: A general framework to understand and analyze many recent indexes. A space-efficient representation based on succinct data structures. An exhaustive experimentation comparing several known and new indexes derived from the framework. The possibility to design new indexes, which can be implemented in a similar way using our framework.
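The K-nearest-reference idea in the abstract above can be illustrated with a minimal sketch. It assumes the simplest signature variant mentioned there (just the set of the K nearest references), Euclidean distance, and a signature similarity defined as the number of shared references. The function names and the `min_shared` threshold are hypothetical and are not taken from the paper; the space-efficient representation it describes is not reproduced here.

```python
import math


def signature(obj, references, k):
    """Return the set of indices of the k references nearest to obj."""
    order = sorted(range(len(references)),
                   key=lambda i: math.dist(obj, references[i]))
    return frozenset(order[:k])


def build_index(database, references, k):
    """Precompute one signature per database object."""
    return [signature(obj, references, k) for obj in database]


def search(query, database, references, index, k, min_shared, n_results=1):
    """Approximate search: only objects whose signature shares at least
    min_shared references with the query signature are compared exactly."""
    q_sig = signature(query, references, k)
    candidates = [i for i, sig in enumerate(index)
                  if len(q_sig & sig) >= min_shared]
    # Exact distances are computed for the (usually small) candidate set only.
    candidates.sort(key=lambda i: math.dist(query, database[i]))
    return candidates[:n_results]


# Toy usage: a 5x5 grid of points, with the diagonal points as references.
database = [(x, y) for x in range(5) for y in range(5)]
references = database[::6]
index = build_index(database, references, k=2)
print(search((1, 2), database, references, index, k=2, min_shared=2))
```

Because objects whose signatures differ too much from the query's are never examined, the answer can miss the true nearest neighbor, which is the accuracy-for-speed trade-off the abstract describes.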


Foundations of Genetic Algorithms | 2009

Free lunches for function and program induction

Riccardo Poli; Mario Graff; Nicholas Freitag McPhee

In this paper we prove that for a variety of practical problems and representations, there is a free lunch for search algorithms that specialise in the task of finding functions or programs that solve problems, such as genetic programming. In other words, not all such algorithms are equally good under all possible performance measures. We focus in particular on the case where the objective is to discover functions that fit sets of data-points - a task that we will call symbolic regression. We show under what conditions there is a free lunch for symbolic regression, highlighting that these are extremely restrictive.


Knowledge-Based Systems | 2015

Term-weighting learning via genetic programming for text classification

Hugo Jair Escalante; Mauricio García-Limón; Alicia Morales-Reyes; Mario Graff; Manuel Montes-y-Gómez; Eduardo F. Morales; José Martínez-Carranza

Highlights: A new method for learning term-weighting schemes is proposed. A genetic program searches for the scheme that maximizes classification performance. The method is evaluated in text and image categorization, and authorship attribution.

This paper describes a novel approach to learning term-weighting schemes (TWSs) in the context of text classification. In text mining, a TWS determines the way in which documents will be represented in a vector space model before applying a classifier. Whereas acceptable performance has been obtained with standard TWSs (e.g., Boolean and term-frequency schemes), the definition of TWSs has traditionally been an art. Further, it is still difficult to determine the best TWS for a particular problem, and it is not yet clear whether schemes better than those currently available can be generated by combining known TWSs. We propose in this article a genetic program that aims at learning effective TWSs that can improve the performance of current schemes in text classification. The genetic program learns how to combine a set of basic units to give rise to discriminative TWSs. We report an extensive experimental study comprising data sets from thematic and non-thematic text classification as well as from image classification. Our study shows the validity of the proposed method; in fact, we show that TWSs learned with the genetic program outperform traditional schemes and other TWSs proposed in recent works. Further, we show that TWSs learned from a specific domain can be effectively used for other tasks.
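To make the notion of a term-weighting scheme concrete, the sketch below implements the two standard schemes named in the abstract (Boolean and term frequency) and one hand-written combination of basic units, TF-IDF. TF-IDF is used here only as a familiar example of combining units; it is not a scheme learned by the genetic program, and all function names are illustrative assumptions.

```python
import math
from collections import Counter


def boolean_tws(doc_tokens, vocabulary):
    """1.0 if the term occurs in the document, else 0.0."""
    present = set(doc_tokens)
    return [1.0 if term in present else 0.0 for term in vocabulary]


def tf_tws(doc_tokens, vocabulary):
    """Raw term frequency of each vocabulary term in the document."""
    counts = Counter(doc_tokens)
    return [float(counts[term]) for term in vocabulary]


def idf_weights(corpus, vocabulary):
    """Inverse document frequency; assumes every vocabulary term
    occurs in at least one document of the corpus."""
    n_docs = len(corpus)
    doc_sets = [set(doc) for doc in corpus]
    return [math.log(n_docs / sum(term in s for s in doc_sets))
            for term in vocabulary]


def tfidf_tws(doc_tokens, vocabulary, idf):
    """A hand-written combination of two basic units: TF times IDF."""
    return [tf * w for tf, w in zip(tf_tws(doc_tokens, vocabulary), idf)]


# Toy usage with a two-document corpus.
corpus = [["a", "b", "a"], ["b", "c"]]
vocabulary = ["a", "b", "c"]
idf = idf_weights(corpus, vocabulary)
print(boolean_tws(corpus[0], vocabulary))
print(tf_tws(corpus[0], vocabulary))
print(tfidf_tws(corpus[0], vocabulary, idf))
```

Each function maps a tokenized document to a vector over the vocabulary, which is the representation a classifier would then consume.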


Congress on Evolutionary Computation | 2013

Wind speed forecasting using genetic programming

Mario Graff; Rafael Peña; Aurelio Medina

This contribution presents the application of genetic programming to the problem of time series forecasting. This forecasting technique is applied to wind speed time series. The results obtained from the forecasting are used to determine the power generation capacity of a fixed-speed wind turbine, which includes a squirrel cage induction generator. The forecast values obtained with genetic programming are compared against the original time series data in order to show the precision of this forecasting technique.


Neurocomputing | 2013

Models of performance of time series forecasters

Mario Graff; Hugo Jair Escalante; Jaime Cerda-Jacobo; Alberto Avalos Gonzalez

One of the first steps when approaching any machine learning task is to select, among all the available procedures, the one that is most adequate to solve a particular problem; in automated problem solving this is known as the algorithm selection problem. Of course, this problem is also present in the field of time series forecasting; there, one needs to select the forecaster that makes the most accurate predictions. Generally, this selection task is performed manually by analyzing the characteristics of the time series, thus relying on the expertise that one has on the available forecasters. In this paper, we propose an automatic procedure to choose a forecaster given a set of candidates, i.e., to solve the algorithm selection problem in this domain. To do so, we follow two paths. Firstly, we propose to model the performance of the forecasters using a linear combination of features that were previously used to assess the problem difficulty of evolutionary algorithms, together with a set of features we propose in this paper. Then, this model is used to predict the performance of the forecasters, and based on these predictions the forecaster is selected. Our second approach is to treat this algorithm selection process as a classification task where the descriptors of each time series are the proposed features. To show the capabilities of our approach, we test three different forecasters on the time series of the M1 and M3 time series competitions. In all the cases tested, our proposals outperform the three forecasters, indicating the viability of our approach.
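The first approach in the abstract above (a linear performance model per forecaster, followed by selection of the forecaster with the best predicted performance) can be sketched as follows. This is a minimal illustration under stated assumptions: the features are arbitrary numeric descriptors standing in for the paper's features, the model is fitted by ordinary least squares through the normal equations, performance is an error to be minimized, and every function name is hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def fit_linear(X, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    Xb = [[1.0] + list(row) for row in X]
    p = len(Xb[0])
    XtX = [[sum(r[i] * r[j] for r in Xb) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(Xb, y)) for i in range(p)]
    return solve(XtX, Xty)


def predict(coef, features):
    """Predicted error: intercept plus a linear combination of the features."""
    return coef[0] + sum(c * f for c, f in zip(coef[1:], features))


def select_forecaster(models, features):
    """Pick the forecaster whose model predicts the lowest error."""
    return min(models, key=lambda name: predict(models[name], features))
```

In use, one model would be fitted per candidate forecaster from the features of training time series and the errors that forecaster obtained on them; `select_forecaster` then receives a dictionary mapping forecaster names to coefficient lists together with the features of a new series.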


Evolutionary Computation | 2013

Models of performance of evolutionary program induction algorithms based on indicators of problem difficulty

Mario Graff; Riccardo Poli; Juan J. Flores

Modeling the behavior of algorithms is the realm of evolutionary algorithm theory. From a practitioner's point of view, theory must provide some guidelines regarding which algorithm/parameters to use in order to solve a particular problem. Unfortunately, most theoretical models of evolutionary algorithms are difficult to apply to realistic situations. However, in recent work (Graff and Poli, 2008, 2010), where we developed a method to practically estimate the performance of evolutionary program-induction algorithms (EPAs), we started addressing this issue. The method was quite general; however, it suffered from some limitations: it required the identification of a set of reference problems, it required hand picking a distance measure in each particular domain, and the resulting models were opaque, typically being linear combinations of 100 features or more. In this paper, we propose a significant improvement of this technique that overcomes the three limitations of our previous method. We achieve this through the use of a novel set of features for assessing problem difficulty for EPAs which are very general, essentially based on the notion of finite difference. To show the capabilities of our technique and to compare it with our previous performance models, we create models for the same two important classes of problems (symbolic regression on rational functions and Boolean function induction) used in our previous work. We model a variety of EPAs. The comparison showed that for the majority of the algorithms and problem classes, the new method produced much simpler and more accurate models than before. To further illustrate the practicality of the technique and its generality (beyond EPAs), we have also used it to predict the performance of both autoregressive models and EPAs on the problem of wind speed forecasting, obtaining simpler and more accurate models that outperform in all cases our previous performance models.


European Conference on Genetic Programming | 2008

Practical model of genetic programming's performance on rational symbolic regression problems

Mario Graff; Riccardo Poli

Many theoretical studies on GP are criticized for not being applicable to the real world. Here we present a practical model for the performance of a standard GP system in real problems. The model gives accurate predictions and has a variety of applications, including the assessment of the similarities and differences of different GP systems.


Iberian Conference on Pattern Recognition and Image Analysis | 2013

Genetic Programming of Prototypes for Pattern Classification

Hugo Jair Escalante; Karlo Mendoza; Mario Graff; Alicia Morales-Reyes

This paper introduces a genetic programming approach to the generation of classification prototypes. Prototype-based classification is a pattern recognition methodology in which the training set of a classification problem is represented by a small subset of instances. The assignment of labels to test instances is usually done by a 1NN rule. We propose a new prototype generation method, based on genetic programming, in which examples of each class are automatically combined to generate highly effective classification prototypes. The genetic program aims to maximize an estimate of the generalization performance of a 1NN classifier using the prototypes. We report experimental results on a benchmark for the evaluation of prototype generation methods. Experimental results show the validity of our approach: the proposed method outperforms most of the state-of-the-art techniques when using both small and large data sets. Better results are obtained for data sets with numeric attributes only, although the performance of our method on mixed data is very competitive as well.
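The 1NN rule over prototypes, and the kind of accuracy estimate a prototype generator would try to maximize, can be sketched in a few lines. The genetic programming step itself is not reproduced: as a clearly different stand-in, the prototypes below are plain class means. Euclidean distance and all function names are assumptions made for illustration.

```python
import math


def predict_1nn(prototypes, x):
    """Label of the prototype nearest to x; prototypes is a list of
    (vector, label) pairs."""
    return min(prototypes, key=lambda p: math.dist(p[0], x))[1]


def accuracy(prototypes, samples):
    """Fraction of (vector, label) samples that 1NN over the prototypes
    labels correctly; a simple estimate a generator could maximize."""
    hits = sum(predict_1nn(prototypes, x) == y for x, y in samples)
    return hits / len(samples)


def class_mean_prototypes(samples):
    """Stand-in generator (not the paper's method): one prototype per
    class, equal to the mean of that class's examples."""
    groups = {}
    for x, y in samples:
        groups.setdefault(y, []).append(x)
    return [(tuple(sum(col) / len(col) for col in zip(*xs)), y)
            for y, xs in groups.items()]


# Toy usage: four training samples reduced to two prototypes.
samples = [((0, 0), "a"), ((0, 1), "a"), ((5, 5), "b"), ((6, 5), "b")]
prototypes = class_mean_prototypes(samples)
print(predict_1nn(prototypes, (1, 1)), accuracy(prototypes, samples))
```

The design point is that classification cost depends on the number of prototypes rather than on the size of the original training set.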

Collaboration


Dive into Mario Graff's collaborations.

Top Co-Authors

Eric Sadit Tellez
Universidad Michoacana de San Nicolás de Hidalgo

Sabino Miranda-Jiménez
Consejo Nacional de Ciencia y Tecnología

Daniela Moctezuma
Consejo Nacional de Ciencia y Tecnología

Hugo Jair Escalante
National Institute of Astrophysics

Juan J. Flores
Universidad Michoacana de San Nicolás de Hidalgo

José Ortiz-Bejar
Universidad Michoacana de San Nicolás de Hidalgo

Jaime Cerda
Universidad Michoacana de San Nicolás de Hidalgo

Alicia Morales-Reyes
National Institute of Astrophysics