Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Simone Scardapane is active.

Publication


Featured research published by Simone Scardapane.


Information Sciences | 2015

Distributed learning for Random Vector Functional-Link networks

Simone Scardapane; Dianhui Wang; Massimo Panella; Aurelio Uncini

This paper aims to develop distributed learning algorithms for Random Vector Functional-Link (RVFL) networks, where training data is distributed under a decentralized information structure. Two algorithms are proposed by using Decentralized Average Consensus (DAC) and Alternating Direction Method of Multipliers (ADMM) strategies, respectively. These algorithms work in a fully distributed fashion and require no coordination from a central agent during the learning process. For distributed learning, the goal is to build a common learner model that optimizes performance over the whole set of local data. In this work, it is assumed that all stations know the initial weights of the input layer, that the output weights of local RVFL networks can be shared through communication channels among neighboring nodes only, and that local datasets are never exchanged. The proposed learning algorithms are evaluated over five benchmark datasets. Experimental results with comparisons show that the DAC-based learning algorithm performs favorably in terms of effectiveness, efficiency and computational complexity, followed by the ADMM-based learning algorithm with promising accuracy but a higher computational burden.
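
As a rough illustration of the DAC-based strategy described above, the following numpy sketch has every node solve a ridge-regularized least-squares problem on its private data and then agree on common output weights through repeated neighbor-only averaging. The four-node ring topology, the mixing matrix P, and all parameter values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared random input layer (assumption: every node knows W_in and b).
n_features, n_hidden = 5, 20
W_in = rng.normal(size=(n_features, n_hidden))
b = rng.normal(size=n_hidden)

def rvfl_hidden(X):
    """Fixed random expansion of the input (the RVFL hidden layer)."""
    return np.tanh(X @ W_in + b)

def local_output_weights(X, y, lam=1e-2):
    """Ridge-regularized least-squares readout on one node's private data."""
    H = rvfl_hidden(X)
    return np.linalg.solve(H.T @ H + lam * np.eye(n_hidden), H.T @ y)

# Four nodes on a ring; P is a doubly stochastic mixing matrix (neighbors only).
P = np.array([[0.50, 0.25, 0.00, 0.25],
              [0.25, 0.50, 0.25, 0.00],
              [0.00, 0.25, 0.50, 0.25],
              [0.25, 0.00, 0.25, 0.50]])

# Each node solves its local problem, then DAC drives all nodes to the average.
betas = np.stack([local_output_weights(rng.normal(size=(30, n_features)),
                                       rng.normal(size=30)) for _ in range(4)])
for _ in range(100):          # consensus iterations: neighbor-to-neighbor mixing only
    betas = P @ betas
beta_consensus = betas[0]     # every row has converged to the common output weights
```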


IEEE Transactions on Neural Networks | 2015

Online Sequential Extreme Learning Machine With Kernels

Simone Scardapane; Danilo Comminiello; Michele Scarpiniti; Aurelio Uncini

The extreme learning machine (ELM) was recently proposed as a unifying framework for different families of learning algorithms. The classical ELM model consists of a linear combination of a fixed number of nonlinear expansions of the input vector. Learning in ELM is hence equivalent to finding the optimal weights that minimize the error on a dataset. The update works in batch mode, either with explicit feature mappings or with implicit mappings defined by kernels. Although an online version has been proposed for the former, no work has been done up to this point for the latter, and whether an efficient learning algorithm for online kernel-based ELM exists remains an open problem. By explicating some connections between nonlinear adaptive filtering and ELM theory, in this brief, we present an algorithm for this task. In particular, we propose a straightforward extension of the well-known kernel recursive least-squares, belonging to the kernel adaptive filtering (KAF) family, to the ELM framework. We call the resulting algorithm the kernel online sequential ELM (KOS-ELM). Moreover, we consider two different criteria used in the KAF field to obtain sparse filters and extend them to our context. We show that KOS-ELM, when integrated with these criteria, can be highly efficient, both in terms of generalization error and training time. Empirical evaluations demonstrate interesting results on some benchmark datasets.
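
The kernel recursive least-squares machinery that KOS-ELM builds on can be sketched as follows: the inverse of the regularized kernel matrix is grown one sample at a time with a block-inverse update, so the model is refined online without re-solving the batch problem. This is a simplified sketch without the sparsification criteria discussed in the paper; the class name and parameter values are illustrative.

```python
import numpy as np

def rbf(x, z, gamma=1.0):
    """Gaussian kernel, i.e., the implicit nonlinear expansion used in kernel ELM."""
    return np.exp(-gamma * np.sum((x - z) ** 2))

class OnlineKernelRidge:
    """Minimal online kernel ridge regression in the spirit of kernel recursive
    least-squares: (K + lam*I)^-1 is updated incrementally as samples arrive."""

    def __init__(self, lam=1e-2, gamma=1.0):
        self.lam, self.gamma = lam, gamma
        self.X, self.y, self.Ainv = [], [], None   # dictionary, targets, inverse

    def update(self, x, y):
        if self.Ainv is None:
            self.Ainv = np.array([[1.0 / (rbf(x, x, self.gamma) + self.lam)]])
        else:
            k = np.array([rbf(xi, x, self.gamma) for xi in self.X])
            ktt = rbf(x, x, self.gamma) + self.lam
            z = self.Ainv @ k
            s = 1.0 / (ktt - k @ z)                 # Schur complement
            top = np.hstack([self.Ainv + s * np.outer(z, z), -s * z[:, None]])
            bottom = np.hstack([-s * z[None, :], np.array([[s]])])
            self.Ainv = np.vstack([top, bottom])    # block-inverse update
        self.X.append(x)
        self.y.append(y)
        self.alpha = self.Ainv @ np.array(self.y)   # dual coefficients

    def predict(self, x):
        k = np.array([rbf(xi, x, self.gamma) for xi in self.X])
        return k @ self.alpha
```

Feeding (x, y) pairs one at a time through `update` and querying with `predict` mimics the online sequential regime discussed above.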


Neural Networks | 2016

A decentralized training algorithm for Echo State Networks in distributed big data applications

Simone Scardapane; Dianhui Wang; Massimo Panella

The current big data deluge requires innovative solutions for performing efficient inference on large, heterogeneous amounts of information. Apart from the known challenges deriving from high volume and velocity, real-world big data applications may impose additional technological constraints, including the need for a fully decentralized training architecture. While several alternatives exist for training feed-forward neural networks in such a distributed setting, less attention has been devoted to the case of decentralized training of recurrent neural networks (RNNs). In this paper, we propose such an algorithm for a class of RNNs known as Echo State Networks. The algorithm is based on the well-known Alternating Direction Method of Multipliers optimization procedure. It is formulated only in terms of local exchanges between neighboring agents, without reliance on a coordinating node. Additionally, it does not require the communication of training patterns, which is a crucial component in realistic big data implementations. Experimental results on large scale artificial datasets show that it compares favorably with a fully centralized implementation, in terms of speed, efficiency and generalization accuracy.
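
A compact way to picture the ADMM-based training is as consensus ADMM applied to the ridge-regression readout of the ESN, with each agent holding its own reservoir states and targets. The sketch below assumes the reservoir states have already been computed locally and, for brevity, computes the global average in the z-update directly, whereas the paper obtains that quantity through local exchanges only; all names and constants are illustrative.

```python
import numpy as np

def consensus_admm_readout(H_list, y_list, lam=1e-2, rho=1.0, iters=100):
    """Consensus ADMM for a ridge-regression readout split over N agents.
    Agent k holds private reservoir states H_list[k] and targets y_list[k]."""
    N = len(H_list)
    d = H_list[0].shape[1]
    w = [np.zeros(d) for _ in range(N)]          # local readout weights
    u = [np.zeros(d) for _ in range(N)]          # scaled dual variables
    z = np.zeros(d)                              # consensus variable
    # Pre-factorize each agent's local system once.
    lhs = [np.linalg.inv(H.T @ H + rho * np.eye(d)) for H in H_list]
    rhs = [H.T @ y for H, y in zip(H_list, y_list)]
    for _ in range(iters):
        for k in range(N):                       # local (parallelizable) updates
            w[k] = lhs[k] @ (rhs[k] + rho * (z - u[k]))
        # Global averaging step; in the paper this average is itself obtained
        # in a decentralized fashion, here it is computed directly for brevity.
        z = rho * sum(wk + uk for wk, uk in zip(w, u)) / (lam + N * rho)
        for k in range(N):
            u[k] += w[k] - z                     # dual update
    return z                                     # shared readout weights
```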


Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery | 2017

Randomness in neural networks: an overview

Simone Scardapane; Dianhui Wang

Neural networks, as powerful tools for data mining and knowledge engineering, can learn from data to build feature-based classifiers and nonlinear predictive models. Training neural networks involves the optimization of nonconvex objective functions, and usually the learning process is costly and infeasible for applications associated with data streams. A possible, albeit counterintuitive, alternative is to randomly assign a subset of the networks' weights so that the resulting optimization task can be formulated as a linear least-squares problem. This methodology can be applied to both feedforward and recurrent networks, and similar techniques can be used to approximate kernel functions. Many experimental results indicate that such randomized models can reach sound performance compared to fully adaptable ones, with a number of favorable benefits, including (1) simplicity of implementation, (2) faster learning with less intervention from human beings, and (3) the possibility of leveraging existing linear regression and classification algorithms (e.g., ℓ1-norm minimization for obtaining sparse formulations). This class of neural networks is attractive and valuable to the data mining community, particularly for handling large-scale data mining in real time. However, the literature in the field is extremely vast and fragmented, with many results being reintroduced multiple times under different names. This overview aims to provide a self-contained, uniform introduction to the different ways in which randomization can be applied to the design of neural networks and kernel functions. A clear exposition of the basic framework underlying all these approaches helps to clarify innovative lines of research and open problems and, most importantly, to foster the exchange of well-known results across different communities. WIREs Data Mining Knowl Discov 2017, 7:e1200. doi: 10.1002/widm.1200
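
The core recipe surveyed here, fixing a random feature map and reducing training to a linear least-squares problem, can be illustrated with random Fourier features, one of the kernel-approximation schemes covered by the overview. The toy dataset, feature count, and regularization value below are arbitrary placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_fourier_features(X, n_feat=200, gamma=1.0):
    """Randomized approximation of the Gaussian kernel exp(-gamma*||x-z||^2):
    project onto random directions, shift by random phases, take cosines."""
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_feat))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_feat)
    return np.sqrt(2.0 / n_feat) * np.cos(X @ W + b), (W, b)

# With the random map fixed, training reduces to regularized linear least squares.
X = rng.normal(size=(500, 3))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=500)
Z, (W, b) = random_fourier_features(X)
lam = 1e-2
beta = np.linalg.solve(Z.T @ Z + lam * np.eye(Z.shape[1]), Z.T @ y)

# Prediction reuses the same fixed random map on new data.
X_new = rng.normal(size=(10, 3))
Z_new = np.sqrt(2.0 / Z.shape[1]) * np.cos(X_new @ W + b)
y_hat = Z_new @ beta
```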


Neurocomputing | 2017

Group sparse regularization for deep neural networks

Simone Scardapane; Danilo Comminiello; Amir Hussain; Aurelio Uncini

In this paper, we address the challenging task of simultaneously optimizing (i) the weights of a neural network, (ii) the number of neurons in each hidden layer, and (iii) the subset of active input features (i.e., feature selection). While these problems are traditionally dealt with separately, we propose an efficient regularized formulation enabling them to be solved jointly using standard optimization routines. Specifically, we extend the group Lasso penalty, originally proposed in the linear regression literature, to impose group-level sparsity on the network's connections, where each group is defined as the set of outgoing weights from a unit. Depending on the specific case, the weights can be related to an input variable, to a hidden neuron, or to a bias unit, thus performing all the aforementioned tasks simultaneously in order to obtain a compact network. We carry out an extensive experimental evaluation, in comparison with classical weight decay and Lasso penalties, both on a toy dataset for handwritten digit recognition and on multiple realistic mid-scale classification benchmarks. Comparative results demonstrate the potential of the proposed sparse group Lasso penalty in producing extremely compact networks, with a significantly lower number of input features and a classification accuracy that is equal or only slightly inferior to that of standard regularization terms.
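
A minimal sketch of the group-level sparsity idea: each group collects the outgoing weights of one unit, the penalty sums the (scaled) Euclidean norms of the groups, and the corresponding proximal step zeroes out entire rows, i.e., whole neurons or input features. The paper optimizes the penalized objective with standard gradient-based routines; the proximal view below is just one convenient way to show the mechanics, and all names are illustrative.

```python
import numpy as np

def group_lasso_penalty(W):
    """Group Lasso over the outgoing weights of each unit: one group per row of W
    (row i = all connections leaving unit i), scaled by sqrt(group size)."""
    return np.sqrt(W.shape[1]) * np.sum(np.linalg.norm(W, axis=1))

def group_soft_threshold(W, step):
    """Proximal step: rows whose norm falls below the threshold are zeroed out,
    removing the corresponding neuron or input feature from the network."""
    thr = step * np.sqrt(W.shape[1])
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - thr / np.maximum(norms, 1e-12))
    return W * scale

# Usage inside a proximal-gradient loop (gradient of the data term not shown):
#   W = group_soft_threshold(W - lr * grad_W, lr * reg_strength)
```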


Neural Networks | 2015

Prediction of telephone calls load using Echo State Network with exogenous variables

Filippo Maria Bianchi; Simone Scardapane; Aurelio Uncini; Antonello Rizzi; Alireza Sadeghian

We approach the problem of forecasting the load of incoming calls in a cell of a mobile network using Echo State Networks. With respect to previous approaches to the problem, we consider the inclusion of additional telephone records regarding the activity registered in the cell as exogenous variables, by investigating their usefulness in the forecasting task. Additionally, we analyze different methodologies for training the readout of the network, including two novel variants, namely ν-SVR and an elastic net penalty. Finally, we employ a genetic algorithm for both the tasks of tuning the parameters of the system and for selecting the optimal subset of most informative additional time-series to be considered as external inputs in the forecasting problem. We compare the performances with standard prediction models and we evaluate the results according to the specific properties of the considered time-series.
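
A bare-bones version of the forecasting setup: the exogenous series are stacked next to the call-load series as extra reservoir inputs, and the readout is trained on the collected states. For simplicity the readout here is plain ridge regression, whereas the paper also evaluates ν-SVR and an elastic net penalty and tunes hyperparameters with a genetic algorithm; the toy series and all constants are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def esn_states(u, exog, n_res=100, rho=0.9, leak=0.3):
    """Run a leaky-integrator reservoir over the target series u together with
    exogenous series (e.g., extra telephone records) stacked as additional inputs."""
    inputs = np.column_stack([u, exog])               # (T, 1 + n_exog)
    W_in = rng.uniform(-1, 1, size=(n_res, inputs.shape[1]))
    W = rng.normal(size=(n_res, n_res))
    W *= rho / np.max(np.abs(np.linalg.eigvals(W)))   # rescale spectral radius
    x = np.zeros(n_res)
    states = []
    for t in range(inputs.shape[0]):
        x = (1 - leak) * x + leak * np.tanh(W_in @ inputs[t] + W @ x)
        states.append(x.copy())
    return np.array(states)

# One-step-ahead forecasting on a toy series, readout trained by ridge regression.
T = 500
u = np.sin(np.linspace(0, 40, T)) + 0.05 * rng.normal(size=T)
exog = np.cos(np.linspace(0, 40, T))[:, None]
H = esn_states(u[:-1], exog[:-1])
target = u[1:]
lam = 1e-6
W_out = np.linalg.solve(H.T @ H + lam * np.eye(H.shape[1]), H.T @ target)
pred = H @ W_out
```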


Neural Networks | 2016

Distributed semi-supervised support vector machines

Simone Scardapane; Roberto Fierimonte; Paolo Di Lorenzo; Massimo Panella; Aurelio Uncini

The semi-supervised support vector machine (S3VM) is a well-known algorithm for performing semi-supervised inference under the large margin principle. In this paper, we are interested in the problem of training an S3VM when the labeled and unlabeled samples are distributed over a network of interconnected agents. In particular, the aim is to design a distributed training protocol over networks, where communication is restricted only to neighboring agents and no coordinating authority is present. Using a standard relaxation of the original S3VM, we formulate the training problem as the distributed minimization of a non-convex social cost function. To find a (stationary) solution in a distributed manner, we employ two different strategies: (i) a distributed gradient descent algorithm; (ii) a recently developed framework for In-Network Nonconvex Optimization (NEXT), which is based on successive convexifications of the original problem, interleaved by state diffusion steps. Our experimental results show that the proposed distributed algorithms have comparable performance with respect to a centralized implementation, while highlighting the pros and cons of the proposed solutions. To date, this is the first work that paves the way toward the broad field of distributed semi-supervised learning over networks.
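
The distributed gradient descent strategy (i) can be sketched with a smooth S3VM relaxation: a squared hinge on labeled points, a Gaussian-like penalty pushing unlabeled points away from the decision boundary, and an adapt-then-combine diffusion step in which each agent mixes its update only with its neighbors. The relaxation, the combination matrix A, and all constants are illustrative choices, not the exact formulation of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_gradient(w, Xl, yl, Xu, lam=1e-2, C_u=0.5, s=3.0):
    """Gradient of one agent's share of a smooth S3VM relaxation: squared hinge
    on labeled points, exp(-s*f(x)^2) on unlabeled points, plus L2 on w."""
    g = lam * w
    margins = 1.0 - yl * (Xl @ w)
    active = margins > 0
    g += -2.0 * (active * margins * yl) @ Xl / len(yl)
    fu = Xu @ w
    g += C_u * (-2.0 * s * fu * np.exp(-s * fu ** 2)) @ Xu / len(Xu)
    return g

# Diffusion (adapt-then-combine) over a 3-agent line graph: each agent takes a
# local gradient step, then averages only with its neighbors via A.
A = np.array([[2/3, 1/3, 0.0],
              [1/3, 1/3, 1/3],
              [0.0, 1/3, 2/3]])           # doubly stochastic combination matrix
d, mu = 4, 0.1
W = np.zeros((3, d))                      # one weight vector per agent
data = [(rng.normal(size=(20, d)), rng.choice([-1, 1], 20), rng.normal(size=(80, d)))
        for _ in range(3)]                # (labeled X, labels, unlabeled X) per agent
for _ in range(200):
    psi = np.stack([W[k] - mu * local_gradient(W[k], *data[k]) for k in range(3)])
    W = A @ psi                           # combine with neighbors only
```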


International Symposium on Neural Networks | 2015

Distributed music classification using Random Vector Functional-Link nets

Simone Scardapane; Roberto Fierimonte; Dianhui Wang; Massimo Panella; Aurelio Uncini

In this paper, we investigate the problem of music classification when training data is distributed throughout a network of interconnected agents (e.g., computers or mobile devices) and is available in a sequential stream. Under the considered setting, the task is for all the nodes, after receiving any new chunk of training data, to agree on a single classifier in a decentralized fashion, without reliance on a master node. In particular, we propose a fully decentralized, sequential learning algorithm for a class of neural networks known as Random Vector Functional-Link nets. The proposed algorithm does not require the presence of a single coordinating agent, and it is formulated exclusively in terms of local exchanges between neighboring nodes, thus making it useful in a wide range of realistic situations. Experimental simulations on four music classification benchmarks show that the algorithm has comparable performance with respect to a centralized solution, where a single agent collects all the local data from every node and subsequently updates the model.
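
The sequential ingredient of the algorithm can be illustrated with a chunk-wise recursive least-squares update of the RVFL output weights, which is how a node can absorb a new chunk of training data without retraining from scratch; the consensus step that follows each chunk is omitted here (it mirrors the DAC sketch given earlier). Class and parameter names are illustrative.

```python
import numpy as np

class SequentialRVFLReadout:
    """Chunk-by-chunk recursive least-squares update of RVFL output weights,
    run locally at each node; after every chunk the nodes would average their
    weights with neighbors (the consensus step is omitted here for brevity)."""

    def __init__(self, n_hidden, lam=1e-2):
        self.P = np.eye(n_hidden) / lam       # inverse correlation matrix
        self.beta = np.zeros(n_hidden)        # output weights

    def update(self, H, y):
        """H: hidden activations for the new chunk, y: chunk targets."""
        G = np.linalg.inv(np.eye(len(y)) + H @ self.P @ H.T)
        self.P -= self.P @ H.T @ G @ H @ self.P
        self.beta += self.P @ H.T @ (y - H @ self.beta)
```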


Information Sciences | 2016

A semi-supervised random vector functional-link network based on the transductive framework

Simone Scardapane; Danilo Comminiello; Michele Scarpiniti; Aurelio Uncini

Semi-supervised learning (SSL) is the problem of learning a function with only a partially labeled training set. It has considerable practical interest in applications where labeled data is costly to obtain, while unlabeled data is abundant. One approach to SSL in the case of binary classification is inspired by work on transductive learning (TL) by Vapnik. It has been applied most prevalently using support vector machines (SVM) as the base learning algorithm, giving rise to the so-called transductive SVM (TR-SVM). The resulting optimization problem, however, is highly non-convex and complex to solve. In this paper, we propose an alternative semi-supervised training algorithm based on the TL theory, namely the semi-supervised random vector functional-link (RVFL) network, which is able to obtain state-of-the-art performance while resulting in a standard convex optimization problem. In particular, we show that, thanks to the characteristics of RVFL networks, the resulting optimization problem can be safely approximated with a standard quadratic programming problem solvable in polynomial time. A wide range of experiments validate our proposal. As a comparison, we also propose a semi-supervised algorithm for RVFL networks based on the theory of manifold regularization.
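
The manifold-regularization comparison mentioned at the end of the abstract admits a compact closed form: fit the labeled targets while penalizing readout outputs that vary across a data graph built on labeled and unlabeled samples together. The transductive formulation itself is solved as a quadratic program and is not reproduced here; the Gaussian similarity graph and parameter values below are assumptions.

```python
import numpy as np

def laplacian(X, gamma=1.0):
    """Unnormalized graph Laplacian from a Gaussian similarity over all samples."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-gamma * sq)
    np.fill_diagonal(W, 0.0)
    return np.diag(W.sum(axis=1)) - W

def semi_supervised_rvfl_readout(H_lab, y_lab, H_all, X_all, lam=1e-2, lam_m=1e-1):
    """Manifold-regularized least-squares readout: fit the labeled targets while
    keeping predictions smooth along the graph built on labeled + unlabeled data."""
    L = laplacian(X_all)
    d = H_all.shape[1]
    A = H_lab.T @ H_lab + lam * np.eye(d) + lam_m * H_all.T @ L @ H_all
    return np.linalg.solve(A, H_lab.T @ y_lab)
```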


Cognitive Computation | 2017

Semi-supervised Echo State Networks for Audio Classification

Simone Scardapane; Aurelio Uncini

Echo state networks (ESNs), belonging to the wider family of reservoir computing methods, are a powerful tool for the analysis of dynamic data. In an ESN, the input signal is fed to a fixed (possibly large) pool of interconnected neurons, whose state is then read by an adaptable layer to provide the output. This last layer is generally trained via a regularized linear least-squares procedure. In this paper, we consider the more complex problem of training an ESN for classification problems in a semi-supervised setting, wherein only a part of the input sequences are effectively labeled with the desired response. To solve the problem, we combine the standard ESN with a semi-supervised support vector machine (S3VM) for training its adaptable connections. Additionally, we propose a novel algorithm for solving the resulting non-convex optimization problem, hinging on a series of successive approximations of the original problem. The resulting procedure is highly customizable and also admits a principled way of parallelizing training over multiple processors/computers. An extensive set of experimental evaluations on audio classification tasks supports the presented semi-supervised ESN as a practical tool for dynamic problems requiring the analysis of partially labeled data.
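
One way to picture the pipeline: each audio sequence, labeled or not, is pushed through a fixed random reservoir and summarized as a fixed-length vector, for example by time-averaging the states, and these vectors are then fed to the semi-supervised S3VM-style readout (a smooth relaxation of which is sketched after the distributed S3VM abstract above). The reservoir sizes and toy sequences below are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def esn_embedding(sequence, W_in, W, leak=0.3):
    """Map a variable-length input sequence to a fixed-length vector by
    time-averaging the reservoir states; the resulting embeddings (some labeled,
    some not) are what the semi-supervised readout is trained on."""
    x = np.zeros(W.shape[0])
    states = []
    for u_t in sequence:
        x = (1 - leak) * x + leak * np.tanh(W_in @ np.atleast_1d(u_t) + W @ x)
        states.append(x)
    return np.mean(states, axis=0)

# Fixed random reservoir, shared across all sequences.
n_res, n_in = 50, 1
W_in = rng.uniform(-1, 1, size=(n_res, n_in))
W = rng.normal(size=(n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))   # rescale spectral radius

embeddings = np.stack([esn_embedding(rng.normal(size=200), W_in, W)
                       for _ in range(20)])       # toy stand-in for audio sequences
```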

Collaboration


Dive into Simone Scardapane's collaborations.

Top Co-Authors

Aurelio Uncini, Sapienza University of Rome
Danilo Comminiello, Sapienza University of Rome
Michele Scarpiniti, Sapienza University of Rome
Raffaele Parisi, Sapienza University of Rome
Massimo Panella, Sapienza University of Rome
Roberto Fierimonte, Sapienza University of Rome