Paolo Frasconi | Researchain

Publication

Featured research published by Paolo Frasconi.

IEEE Transactions on Neural Networks | 1994

Learning long-term dependencies with gradient descent is difficult

Yoshua Bengio; Patrice Y. Simard; Paolo Frasconi

Recurrent neural networks can be used to map input sequences to output sequences, such as for recognition, production or prediction problems. However, practical difficulties have been reported in training recurrent neural networks to perform tasks in which the temporal contingencies present in the input/output sequences span long intervals. We show why gradient based learning algorithms face an increasingly difficult problem as the duration of the dependencies to be captured increases. These results expose a trade-off between efficient learning by gradient descent and latching on information for long periods. Based on an understanding of this problem, alternatives to standard gradient descent are considered.
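As a rough illustration of the trade-off this abstract describes, the toy sketch below (not taken from the paper) uses a single tanh unit with a self-connection and no input. The weight `w = 2.0`, the starting value, and the name `latch_gradient` are assumptions chosen for illustration. With this weight the unit settles near a stable fixed point, so it can hold information, while the derivative of the final state with respect to the initial state shrinks geometrically with the number of steps.

```python
import math

def latch_gradient(w, x0, steps):
    """Iterate x_t = tanh(w * x_{t-1}) and track dx_T/dx_0 by the chain rule.

    Hypothetical one-unit model, used only to illustrate the effect.
    """
    x, grad = x0, 1.0
    for _ in range(steps):
        x = math.tanh(w * x)
        # d tanh(w * x_prev) / d x_prev = w * (1 - tanh(w * x_prev)^2) = w * (1 - x^2)
        grad *= w * (1.0 - x * x)
    return x, grad

# Assumed illustrative settings: self-weight 2.0, initial state 0.9.
for steps in (1, 10, 50):
    state, grad = latch_gradient(2.0, 0.9, steps)
    print(steps, round(state, 4), grad)
```

In this setting every per-step factor `w * (1 - x^2)` is below 1, so the state stays latched while the gradient signal from distant time steps fades.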

IEEE Transactions on Neural Networks | 1998

A general framework for adaptive processing of data structures

Paolo Frasconi; Marco Gori; Alessandro Sperduti

A structured organization of information is typically required by symbolic processing. On the other hand, most connectionist models assume that data are organized according to relatively poor structures, like arrays or sequences. The framework described in this paper is an attempt to unify adaptive models like artificial neural nets and belief nets for the problem of processing structured information. In particular, relations between data variables are expressed by directed acyclic graphs, where both numerical and categorical values coexist. The general framework proposed in this paper can be regarded as an extension of both recurrent neural networks and hidden Markov models to the case of acyclic graphs. In particular, we study the supervised learning problem as the problem of learning transductions from an input structured space to an output structured space, where transductions are assumed to admit a recursive hidden state-space representation. We introduce a graphical formalism for representing this class of adaptive transductions by means of recursive networks, i.e., cyclic graphs where nodes are labeled by variables and edges are labeled by generalized delay elements. This representation makes it possible to incorporate the symbolic and subsymbolic nature of data. Structures are processed by unfolding the recursive network into an acyclic graph called encoding network. In so doing, inference and learning algorithms can be easily inherited from the corresponding algorithms for artificial neural networks or probabilistic graphical models.

IEEE Transactions on Neural Networks | 1996

Input-output HMMs for sequence processing

Yoshua Bengio; Paolo Frasconi

We consider problems of sequence processing and propose a solution based on a discrete-state model in order to represent past context. We introduce a recurrent connectionist architecture having a modular structure that associates a subnetwork to each state. The model has a statistical interpretation we call input-output hidden Markov model (IOHMM). It can be trained by the expectation-maximization (EM) or generalized EM (GEM) algorithms, considering state trajectories as missing data, which decouples temporal credit assignment and actual parameter estimation. The model presents similarities to hidden Markov models (HMMs), but allows us to map input sequences to output sequences, using the same processing style as recurrent neural networks. IOHMMs are trained using a more discriminant learning paradigm than HMMs, while potentially taking advantage of the EM algorithm. We demonstrate that IOHMMs are well suited for solving grammatical inference problems on a benchmark problem. Experimental results are presented for the seven Tomita grammars, showing that these adaptive models can attain excellent generalization.
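To make the idea of input-conditioned transitions and emissions concrete, here is a minimal sketch of a forward recursion that computes the probability of an output sequence given an input sequence. It is a simplification under stated assumptions: inputs and outputs are discrete symbols, and plain lookup tables stand in for the per-state subnetworks mentioned in the abstract. The function name, table layout, and toy numbers are all hypothetical.

```python
def iohmm_likelihood(inputs, outputs, init, trans, emit):
    """Forward recursion for P(outputs | inputs) in a toy discrete IOHMM.

    init[i]        : probability of state i before the first step
    trans[u][j][i] : P(state_t = i | state_{t-1} = j, input_t = u)
    emit[u][i][y]  : P(output_t = y | state_t = i, input_t = u)
    """
    n = len(init)
    alpha = list(init)
    for u, y in zip(inputs, outputs):
        # Propagate through the input-dependent transition, then weight by the emission.
        alpha = [
            emit[u][i][y] * sum(trans[u][j][i] * alpha[j] for j in range(n))
            for i in range(n)
        ]
    return sum(alpha)

# Assumed toy parameters: two states, two input symbols, two output symbols.
INIT = [0.6, 0.4]
TRANS = {0: [[0.9, 0.1], [0.2, 0.8]], 1: [[0.3, 0.7], [0.5, 0.5]]}
EMIT = {0: [[0.8, 0.2], [0.1, 0.9]], 1: [[0.5, 0.5], [0.4, 0.6]]}

print(iohmm_likelihood([0, 1, 1], [0, 0, 1], INIT, TRANS, EMIT))
```

Because each transition and emission table is normalized for every input symbol, the likelihoods of all possible output sequences for a fixed input sequence sum to one, which is a quick sanity check on the recursion.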

Nucleic Acids Research | 2006

DISULFIND: a disulfide bonding state and cysteine connectivity prediction server

Alessio Ceroni; Andrea Passerini; Alessandro Vullo; Paolo Frasconi

DISULFIND is a server for predicting the disulfide bonding state of cysteines and their disulfide connectivity starting from sequence alone. Optionally, disulfide connectivity can be predicted from sequence and a bonding state assignment given as input. The output is a simple visualization of the assigned bonding state (with confidence degrees) and the most likely connectivity patterns. The server is available at .

Neural Computation | 1992

Local feedback multilayered networks

Paolo Frasconi; Marco Gori; Giovanni Soda

In this paper, we investigate the capabilities of local feedback multilayered networks, a particular class of recurrent networks, in which feedback connections are only allowed from neurons to themselves. In this class, learning can be accomplished by an algorithm that is local in both space and time. We describe the limits and properties of these networks and give some insights on their use for solving practical problems.

IEEE Transactions on Neural Networks | 1995

Learning without local minima in radial basis function networks

Monica Bianchini; Paolo Frasconi; Marco Gori

Learning from examples plays a central role in artificial neural networks. The success of many learning schemes is not guaranteed, however, since algorithms like backpropagation may get stuck in local minima, thus providing suboptimal solutions. For feedforward networks, optimal learning can be achieved provided that certain conditions on the network and the learning environment are met. This principle is investigated for the case of networks using radial basis functions (RBF). It is assumed that the patterns of the learning environment are separable by hyperspheres. In that case, we prove that the attached cost function is local minima free with respect to all the weights. This provides us with some theoretical foundations for a massive application of RBF in pattern recognition.
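The phrase "separable by hyperspheres" can be made concrete with a single Gaussian unit: thresholding its output yields a spherical decision region around the unit's center. The sketch below assumes a Gaussian basis function with width `sigma`; the abstract does not specify the basis function, and the helper names are hypothetical.

```python
import math

def rbf_unit(x, center, sigma):
    """Gaussian radial basis function exp(-||x - c||^2 / (2 sigma^2))."""
    d2 = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def sphere_radius(sigma, threshold):
    """Radius inside which rbf_unit exceeds `threshold` (requires 0 < threshold < 1)."""
    return sigma * math.sqrt(-2.0 * math.log(threshold))

# Assumed illustrative values: unit width 1.0, decision threshold 0.5.
r = sphere_radius(1.0, 0.5)
print(r, rbf_unit([0.5, 0.5], [0.0, 0.0], 1.0) > 0.5)
```

Points closer to the center than `r` produce an activation above the threshold and points farther away fall below it, so a single thresholded unit carves out one hypersphere.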

Bioinformatics | 2004

Disulfide connectivity prediction using recursive neural networks and evolutionary information

Alessandro Vullo; Paolo Frasconi

Motivation: We focus on the prediction of disulfide bridges in proteins starting from their amino acid sequence and from the knowledge of the disulfide bonding state of each cysteine. The location of disulfide bridges is a structural feature that conveys important information about the protein main chain conformation and can therefore help towards the solution of the folding problem. Existing approaches based on weighted graph matching algorithms do not take advantage of evolutionary information. Recursive neural networks (RNN), on the other hand, can handle in a natural way complex data structures such as graphs whose vertices are labeled by real vectors, allowing us to incorporate multiple alignment profiles in the graphical representation of disulfide connectivity patterns. Results: The core of the method is the use of machine learning tools to rank alternative disulfide connectivity patterns. We develop an ad-hoc RNN architecture for scoring labeled undirected graphs that represent connectivity patterns. In order to compare our algorithm with previous methods, we report experimental results on the SWISS-PROT 39 dataset. We find that using multiple alignment profiles allows us to obtain significant prediction accuracy improvements, clearly demonstrating the important role played by evolutionary information. Availability: The Web interface of the predictor is available at http://neural.dsi.unifi.it/cysteines

IEEE Transactions on Neural Networks | 2004

New results on error correcting output codes of kernel machines

Andrea Passerini; Massimiliano Pontil; Paolo Frasconi

We study the problem of multiclass classification within the framework of error correcting output codes (ECOC) using margin-based binary classifiers. Specifically, we address two important open problems in this context: decoding and model selection. The decoding problem concerns how to map the outputs of the classifiers into class codewords. In this paper we introduce a new decoding function that combines the margins through an estimate of their class conditional probabilities. Concerning model selection, we present new theoretical results bounding the leave-one-out (LOO) error of ECOC of kernel machines, which can be used to tune kernel hyperparameters. We report experiments using support vector machines as the base binary classifiers, showing the advantage of the proposed decoding function over other functions of the margin commonly used in practice. Moreover, our empirical evaluations on model selection indicate that the bound leads to good estimates of kernel parameters.
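For readers unfamiliar with the decoding step, the sketch below contrasts two generic decoders for a code matrix with entries in {-1, +1}: Hamming decoding, which only looks at the signs of the classifier outputs, and a hinge-loss decoder, which also uses the size of the margins. Neither is the probability-based decoding function proposed in the paper, which the abstract does not specify. The one-vs-all code matrix, the margin values, and the function names are assumptions for illustration.

```python
def hamming_decode(code, margins):
    """Pick the class whose codeword disagrees in sign with the fewest margins."""
    def distance(row):
        return sum(1 for m, f in zip(row, margins) if m * f <= 0)
    # min() returns the first minimizer, so ties go to the lowest class index.
    return min(range(len(code)), key=lambda c: distance(code[c]))

def hinge_loss_decode(code, margins):
    """Pick the class whose codeword has the smallest total hinge loss."""
    def loss(row):
        return sum(max(0.0, 1.0 - m * f) for m, f in zip(row, margins))
    return min(range(len(code)), key=lambda c: loss(code[c]))

# Assumed one-vs-all code for three classes: row c is the codeword of class c.
ONE_VS_ALL = [
    [+1, -1, -1],
    [-1, +1, -1],
    [-1, -1, +1],
]

# Classifier 0 fires weakly and classifier 1 fires strongly.
margins = [0.05, 0.9, -1.0]
print(hamming_decode(ONE_VS_ALL, margins), hinge_loss_decode(ONE_VS_ALL, margins))
```

With these margins, classes 0 and 1 are tied in Hamming distance and the tie is broken by index, while the hinge-loss decoder prefers class 1 because its classifier produced the larger margin. This is the kind of information that sign-only decoding discards.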

IEEE Transactions on Intelligent Transportation Systems | 2013

Short-Term Traffic Flow Forecasting: An Experimental Comparison of Time-Series Analysis and Supervised Learning

Marco Lippi; Matteo Bertini; Paolo Frasconi

The literature on short-term traffic flow forecasting has undergone great development recently. Many works, describing a wide variety of different approaches, which very often share similar features and ideas, have been published. However, publications presenting new prediction algorithms usually employ different settings, data sets, and performance measurements, making it difficult to infer a clear picture of the advantages and limitations of each model. The aim of this paper is twofold. First, we review existing approaches to short-term traffic flow forecasting methods under the common view of probabilistic graphical models, presenting an extensive experimental comparison, which proposes a common baseline for their performance analysis and provides the infrastructure to operate on a publicly available data set. Second, we present two new support vector regression models, which are specifically devised to benefit from typical traffic flow seasonality and are shown to represent an interesting compromise between prediction accuracy and computational efficiency. The SARIMA model coupled with a Kalman filter is the most accurate model; however, the proposed seasonal support vector regressor turns out to be highly competitive when performing forecasts during the most congested periods.

Pattern Recognition | 2003

Combining flat and structured representations for fingerprint classification with recursive neural networks and support vector machines

Yuan Yao; Gian Luca Marcialis; Massimiliano Pontil; Paolo Frasconi; Fabio Roli

We present new fingerprint classification algorithms based on two machine learning approaches: support vector machines (SVMs) and recursive neural networks (RNNs). RNNs are trained on a structured representation of the fingerprint image. They are also used to extract a set of distributed features of the fingerprint which can be integrated in the SVM. SVMs are combined with a new error-correcting code scheme. This approach has two main advantages: (a) it can tolerate the presence of ambiguous fingerprint images in the training set and (b) it can effectively identify the most difficult fingerprint images in the test set. By rejecting these images, the accuracy of the system improves significantly. We report experiments on the fingerprint database NIST-4. Our best classification accuracy is 95.6 percent at a 20 percent rejection rate and is obtained by training SVMs on both FingerCode and RNN-extracted features. This result indicates the benefit of integrating global and structured representations and suggests that SVMs are a promising approach for fingerprint classification.
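The notion of accuracy at a given rejection rate can be sketched as follows: rank test items by a confidence score, discard the least confident fraction, and measure accuracy on the rest. The abstract does not say how the paper scores confidence, so the scores, the function name, and the toy data below are purely illustrative assumptions.

```python
def accuracy_with_rejection(confidences, correct, reject_rate):
    """Accuracy on the items kept after rejecting the least confident fraction.

    confidences : one score per item (higher means more confident)
    correct     : 1 if the prediction for that item was right, else 0
    reject_rate : fraction of items to reject, in [0, 1)
    """
    n = len(confidences)
    n_reject = int(round(reject_rate * n))
    order = sorted(range(n), key=lambda i: confidences[i])
    kept = order[n_reject:]
    if not kept:
        raise ValueError("reject_rate leaves no items to evaluate")
    return sum(correct[i] for i in kept) / len(kept)

# Made-up scores for ten items; the two least confident ones are misclassified.
CONF = [0.9, 0.8, 0.3, 0.95, 0.4, 0.7, 0.85, 0.6, 0.99, 0.75]
CORRECT = [1, 1, 0, 1, 0, 1, 1, 1, 1, 1]

print(accuracy_with_rejection(CONF, CORRECT, 0.0))
print(accuracy_with_rejection(CONF, CORRECT, 0.2))
```

In this toy data the errors fall on the least confident items, so rejecting them raises the accuracy on what remains. Real data will show a smaller effect when confidence and correctness are less tightly coupled.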
