Mark van Heeswijk
Aalto University
Publications
Featured research published by Mark van Heeswijk.
Neurocomputing | 2011
Yoan Miche; Mark van Heeswijk; Patrick Bas; Olli Simula; Amaury Lendasse
This paper proposes an improvement of the optimally pruned extreme learning machine (OP-ELM) in the form of an L2 regularization penalty applied within the OP-ELM. The OP-ELM originally proposes a wrapper methodology around the extreme learning machine (ELM), meant to reduce the sensitivity of the ELM to irrelevant variables and to obtain more parsimonious models through neuron pruning. The proposed modification uses a cascade of two regularization penalties: first an L1 penalty to rank the neurons of the hidden layer, followed by an L2 penalty on the regression weights (the regression between the hidden layer and the output layer) for numerical stability and efficient pruning of the neurons. The new methodology is tested against state-of-the-art methods such as support vector machines and Gaussian processes, as well as the original ELM and OP-ELM, on 11 different data sets; it systematically outperforms the OP-ELM (on average a 27% better mean square error) and provides more reliable results in terms of the standard deviation of the results, while always remaining less than one order of magnitude slower than the OP-ELM.
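As a rough illustration of the two-penalty cascade described above, the sketch below builds a random ELM hidden layer, ranks its neurons with an L1 path (here scikit-learn's LassoLarsCV), and refits the surviving neurons with an L2 ridge penalty. All names and the choice of LassoLarsCV are illustrative assumptions, not the paper's implementation.

```python
import numpy as np
from sklearn.linear_model import LassoLarsCV, Ridge

def trop_elm_sketch(X, y, n_hidden=100, ridge_alpha=1e-3, seed=0):
    """Minimal sketch of the L1 -> L2 cascade behind TROP-ELM."""
    rng = np.random.default_rng(seed)
    # Random hidden layer, as in the basic ELM.
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)
    # Step 1: an L1 penalty ranks/prunes the hidden neurons.
    lasso = LassoLarsCV(cv=5).fit(H, y)
    keep = np.flatnonzero(lasso.coef_)
    # Step 2: an L2 penalty on the kept neurons, for numerical stability.
    ridge = Ridge(alpha=ridge_alpha).fit(H[:, keep], y)
    return W, b, keep, ridge
```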
Neurocomputing | 2011
Mark van Heeswijk; Yoan Miche; Erkki Oja; Amaury Lendasse
The paper presents an approach for performing regression on large data sets in reasonable time, using an ensemble of extreme learning machines (ELMs). The main contribution is to explore how the evaluation of this ensemble of ELMs can be accelerated in three distinct ways: (1) training and model structure selection of the individual ELMs are accelerated by performing these steps on the graphics processing unit (GPU) instead of the processor (CPU); (2) the training of an ELM is performed in such a way that computed results can be reused in the model structure selection, making training plus model structure selection more efficient; (3) the modularity of the ensemble model is exploited, and model training and model structure selection are parallelized across multiple GPU and CPU cores, so that multiple models can be built at the same time. The experiments show that competitive performance is obtained on the regression tasks, and that the GPU-accelerated and parallelized ELM ensemble achieves attractive speedups over a single CPU. Furthermore, the proposed approach is not limited to a specific type of ELM and can be employed for a large variety of ELMs.
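Because each ELM in the ensemble trains independently (a random hidden layer plus a least-squares fit of the output weights), training parallelizes trivially. The sketch below illustrates this with CPU worker processes only; the paper's GPU kernels are not reproduced, and all names are hypothetical.

```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def train_elm(args):
    """Train one ELM: random hidden layer + least-squares output weights."""
    X, y, n_hidden, seed = args
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)
    return W, b, beta

def train_ensemble(X, y, n_models=8, n_hidden=200):
    """Models are independent, so multiple models can be built at once."""
    jobs = [(X, y, n_hidden, seed) for seed in range(n_models)]
    with ProcessPoolExecutor() as pool:
        return list(pool.map(train_elm, jobs))

def ensemble_predict(models, X):
    """Average the predictions of the individual ELMs."""
    return np.mean([np.tanh(X @ W + b) @ beta for W, b, beta in models], axis=0)
```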
international conference on artificial neural networks | 2009
Mark van Heeswijk; Yoan Miche; Tiina Lindh-Knuutila; Peter A. J. Hilbers; Timo Honkela; Erkki Oja; Amaury Lendasse
In this paper, we investigate the application of adaptive ensemble models of Extreme Learning Machines (ELMs) to the problem of one-step-ahead prediction in (non)stationary time series. We verify that the method works on stationary time series and test the adaptivity of the ensemble model on a nonstationary time series. In the experiments, we show that the adaptive ensemble model achieves a test error comparable to the best methods while remaining adaptive. Moreover, it has low computational cost.
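The abstract does not spell out the adaptation rule, so the following is only one plausible sketch: after each one-step-ahead prediction, the ensemble weights are nudged toward the models with the smallest recent errors, which keeps the combination cheap to update on nonstationary data.

```python
import numpy as np

def update_ensemble_weights(weights, recent_errors, lr=0.1, eps=1e-12):
    """Shift ensemble weight toward models with small recent error.
    The inverse-error rule and the learning rate are assumptions."""
    inv = 1.0 / (np.abs(recent_errors) + eps)
    target = inv / inv.sum()
    weights = (1.0 - lr) * weights + lr * target
    return weights / weights.sum()
```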
Neurocomputing | 2014
Rui Nian; Bo He; Bing Zheng; Mark van Heeswijk; Qi Yu; Yoan Miche; Amaury Lendasse
In this paper, we present a dynamic model hypothesis to perform fish trajectory tracking in fish ethology research and develop the relevant mathematical criterion on the basis of the Extreme Learning Machine (ELM). It is shown that the proposed scheme can conduct the non-linear and non-Gaussian tracking process using multiple historical cues and current predictions (the state vector motion, the color distribution and the appearance recognition), all of which can be extracted from the single-hidden-layer feedforward neural network (SLFN) at diverse levels with ELM. A hierarchical hybrid ELM ensemble strategy then combines the individual SLFNs of the tracking cues to improve performance. The simulation results show the excellent robustness and accuracy of the developed approach.
Cognitive Computation | 2013
Bo He; Dongxun Xu; Rui Nian; Mark van Heeswijk; Qi Yu; Yoan Miche; Amaury Lendasse
Most face recognition approaches developed so far rely on sparse coding as one of their essential ingredients, yet sparse coding models are hampered by the extremely expensive computational cost of their implementation. In this paper, a novel scheme for fast face recognition is presented via the extreme learning machine (ELM) and sparse coding. The common feature hypothesis is first introduced to extract the basis functions from local universal images, and then a single-hidden-layer feedforward network (SLFN) is established to simulate the sparse coding process for the face images with the ELM algorithm. Developments are made to preserve the efficient inherent information embedding of ELM learning. The resulting local sparse coding coefficients are then grouped into a global representation and further fed into an ELM ensemble, composed of a number of SLFNs, for face recognition. The simulation results show that the proposed approach performs comparably to state-of-the-art techniques at a much higher speed.
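A minimal sketch of the central shortcut, under the assumption that it can be phrased as regression: learn a dictionary and sparse codes once (slowly), then train an ELM to map images directly to their codes, so that at test time one cheap forward pass replaces the per-image sparse-coding optimization. The dictionary-learning step and all names are illustrative, not the paper's pipeline.

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

def fit_code_regressor(X, n_hidden=500, n_atoms=64, seed=0):
    """Train an ELM that approximates the sparse-coding map X -> codes."""
    rng = np.random.default_rng(seed)
    dl = MiniBatchDictionaryLearning(n_components=n_atoms, alpha=1.0).fit(X)
    S = dl.transform(X)                  # slow: iterative sparse coding
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)
    beta, *_ = np.linalg.lstsq(H, S, rcond=None)   # fast closed-form fit
    return W, b, beta

def predict_codes(X, W, b, beta):
    """One forward pass approximates the sparse codes."""
    return np.tanh(X @ W + b) @ beta
```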
Neurocomputing | 2014
Qi Yu; Mark van Heeswijk; Yoan Miche; Rui Nian; Bo He; Eric Séverin; Amaury Lendasse
Extreme learning machine (ELM) has shown good performance in regression applications at very high speed. However, it remains difficult to strike a compromise between better generalization performance and a smaller ELM complexity (the number of hidden nodes). This paper proposes a method called Delta Test-ELM (DT-ELM), which operates in an incremental way to create less complex ELM structures and determines the number of hidden nodes automatically. It uses the Bayesian Information Criterion (BIC) as well as the Delta Test (DT) to restrict the search, account for the size of the network and prevent overfitting. Moreover, ensemble modeling is applied over different DT-ELM models and shows good test results in the Experiments section.
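A hedged sketch of the incremental loop: hidden nodes are added one at a time, BIC penalizes network size, and the Delta Test estimate of the noise variance acts as a floor below which driving the training error further would be overfitting. The exact stopping rule is an assumption.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def delta_test(X, y):
    """Delta Test: noise-variance estimate from first nearest neighbours."""
    _, idx = NearestNeighbors(n_neighbors=2).fit(X).kneighbors(X)
    return 0.5 * np.mean((y - y[idx[:, 1]]) ** 2)  # idx[:, 0] is the point itself

def dt_elm_sketch(X, y, max_hidden=200, seed=0):
    """Grow hidden nodes incrementally; keep the best-BIC model and stop
    once the training MSE reaches the estimated noise floor."""
    rng = np.random.default_rng(seed)
    n, noise = len(y), delta_test(X, y)
    H, best_bic, best = np.empty((n, 0)), np.inf, None
    for k in range(1, max_hidden + 1):
        w, b = rng.standard_normal(X.shape[1]), rng.standard_normal()
        H = np.hstack([H, np.tanh(X @ w + b)[:, None]])  # add one hidden node
        beta, *_ = np.linalg.lstsq(H, y, rcond=None)
        mse = np.mean((y - H @ beta) ** 2)
        bic = n * np.log(mse) + k * np.log(n)
        if bic < best_bic:
            best_bic, best = bic, (k, beta)
        if mse <= noise:  # fitting below the noise estimate = overfitting
            break
    return best, best_bic
```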
international conference on artificial neural networks | 2013
Amaury Lendasse; Anton Akusok; Olli Simula; Francesco Corona; Mark van Heeswijk; Emil Eirola; Yoan Miche
This paper describes the original (basic) Extreme Learning Machine (ELM). Properties such as robustness and sensitivity to variable selection are studied. Several extensions of the original ELM are then presented and compared. First, the Tikhonov-Regularized Optimally-Pruned Extreme Learning Machine (TROP-ELM) is summarized as an improvement of the Optimally-Pruned Extreme Learning Machine (OP-ELM) in the form of an L2 regularization penalty applied within the OP-ELM. Second, a methodology to Linearly Ensemble ELMs (ELM-ELM) is presented in order to improve the performance of the original ELM. These methodologies (TROP-ELM and ELM-ELM) are tested against state-of-the-art methods such as Support Vector Machines and Gaussian Processes, as well as the original ELM and OP-ELM, on ten different data sets. A specific experiment testing the sensitivity of these methodologies to variable selection is also presented.
Neurocomputing | 2015
Mark van Heeswijk; Yoan Miche
In this paper, a new hidden layer construction method for Extreme Learning Machines (ELMs) is investigated, aimed at generating a diverse set of weights. The paper proposes two new ELM variants: Binary ELM, with a weight initialization scheme based on {0,1}-weights; and Ternary ELM, with a weight initialization scheme based on {-1,0,1}-weights. The motivation behind this approach is that these features come from very different subspaces, so each neuron extracts more diverse information from the inputs than neurons with the completely random features traditionally used in ELM; ideally, this should lead to better ELMs. Experiments show that ELMs with ternary weights do generally achieve lower test error. Furthermore, the experiments show that the Binary and Ternary ELMs are more robust to irrelevant and noisy variables and are in fact performing implicit variable selection. Finally, since only the weight generation scheme is adapted, the computational time of the ELM is unaffected, and the improved accuracy, added robustness and implicit variable selection of Binary ELM and Ternary ELM come for free.
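Since only the weight generation scheme changes, the variants amount to swapping one line in any ELM implementation. The sketch below simply draws the weights uniformly; the paper's actual procedure for covering the input subspaces may differ.

```python
import numpy as np

def binary_weights(n_inputs, n_hidden, seed=0):
    """{0,1} weights: each neuron sees a different subset of the inputs."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, 2, size=(n_inputs, n_hidden)).astype(float)

def ternary_weights(n_inputs, n_hidden, seed=0):
    """{-1,0,1} weights: different input subsets plus sign diversity."""
    rng = np.random.default_rng(seed)
    return rng.integers(-1, 2, size=(n_inputs, n_hidden)).astype(float)
```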
Entropy | 2014
Alberto Guillén; M. Isabel Garcia Arenas; Mark van Heeswijk; Dušan Sovilj; Amaury Lendasse; Luis Javier Herrera; Héctor Pomares; Ignacio Rojas
Feature or variable selection remains an unsolved problem, since evaluating the entire solution space is infeasible. Several algorithms based on heuristics have been proposed so far with successful results. However, these algorithms were not designed for very large datasets, which makes their execution impossible due to memory and time limitations. This paper presents an implementation of a genetic algorithm that has been parallelized using the classical island approach, while also using graphics processing units to speed up the computation of the fitness function. Special attention has been paid to the population evaluation, as well as to the migration operator of the parallel genetic algorithm (GA), which is usually not considered very significant although, as the experiments show, it is crucial for obtaining robust results.
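As an illustration of the migration operator the experiments single out, the sketch below implements ring-topology migration: each island's best individuals overwrite the worst individuals of the next island. The population layout and the higher-is-better fitness convention are assumptions.

```python
import numpy as np

def migrate(islands, fitnesses, n_migrants=2):
    """Ring migration between islands, modifying populations in place.
    islands: list of (pop_size, genome_len) arrays; fitnesses: 1D arrays."""
    outgoing = []
    for pop, fit in zip(islands, fitnesses):
        best = np.argsort(fit)[-n_migrants:]   # assume higher fitness is better
        outgoing.append(pop[best].copy())
    for i in range(len(islands)):
        dst = (i + 1) % len(islands)
        worst = np.argsort(fitnesses[dst])[:n_migrants]
        islands[dst][worst] = outgoing[i]      # replace the destination's worst
```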
Parallel Architectures and Bioinspired Algorithms | 2012
Alberto Guillén; Dušan Sovilj; Mark van Heeswijk; Luis Javier Herrera; Amaury Lendasse; Héctor Pomares; Ignacio Rojas
The design of a model to approximate a function relies significantly on the data used in the training stage. The problem of selecting an adequate set of variables must be treated carefully due to its importance: if the number of variables is high, the number of samples needed to design the model becomes too large and the interpretability of the model is lost. This chapter presents several methodologies to perform variable selection in a local or a global manner, using a non-parametric noise estimator to determine the quality of a subset of variables. Several methods that apply parallel paradigms on different architectures are compared from the optimization and efficiency points of view, since the problem is computationally expensive.
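One widely used non-parametric noise estimator for this purpose is the Delta Test; a short sketch of scoring a candidate variable subset with it follows (lower estimated noise variance suggests a better subset). The function name and the nearest-neighbour implementation are illustrative assumptions.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def subset_score(X, y, subset):
    """Delta Test on the inputs restricted to `subset`: lower is better."""
    Xs = X[:, list(subset)]
    _, idx = NearestNeighbors(n_neighbors=2).fit(Xs).kneighbors(Xs)
    return 0.5 * np.mean((y - y[idx[:, 1]]) ** 2)

# A search procedure (greedy, GA, ...) can compare subsets by this score,
# accepting moves that lower the estimated noise variance.
```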