Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Edmondo Trentin is active.

Publication


Featured research published by Edmondo Trentin.


Neurocomputing | 2001

A survey of hybrid ANN/HMM models for automatic speech recognition

Edmondo Trentin; Marco Gori

In spite of the advances accomplished throughout the last decades, automatic speech recognition (ASR) is still a challenging and difficult task. In particular, recognition systems based on hidden Markov models (HMMs) are effective under many circumstances, but suffer from some major limitations that restrict the applicability of ASR technology in real-world environments. Attempts were made to overcome these limitations with the adoption of artificial neural networks (ANNs) as an alternative paradigm for ASR, but ANNs were unsuccessful in dealing with long time sequences of speech signals. Between the end of the 1980s and the beginning of the 1990s, some researchers began exploring a new research area by combining HMMs and ANNs within a single, hybrid architecture. The goal of hybrid systems for ASR is to take advantage of the properties of both HMMs and ANNs, improving flexibility and recognition performance. A variety of different architectures and novel training algorithms have been proposed in the literature. This paper reviews a number of significant hybrid models for ASR, putting together approaches and techniques from a highly specialized and non-homogeneous literature. Efforts concentrate on describing and referencing architectures and algorithms, their advantages and limitations, as well as on categorizing them into broad classes. Early attempts to emulate HMMs by ANNs are described first. We then focus on ANNs used to estimate the posterior probabilities of the states of an HMM, and on "global" optimization, where a single, overall training criterion is defined over the HMM and the ANNs. Connectionist vector quantization for discrete HMMs and other more recent approaches are also reviewed. It is pointed out that, in addition to their theoretical interest, hybrid systems have allowed for tangible improvements in recognition performance over standard HMMs in difficult and significant benchmark tasks.
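The posterior-estimation hybrid mentioned in the abstract can be illustrated with a minimal sketch (not the paper's code, and with toy numbers assumed): an ANN estimates the state posteriors P(q | x_t), and dividing by the state priors P(q) yields scaled likelihoods that replace the Gaussian emission densities in standard HMM decoding.

```python
import numpy as np

def posteriors_to_scaled_likelihoods(posteriors, priors):
    """Convert ANN state posteriors to scaled emission likelihoods
    p(x_t | q) / p(x_t) = P(q | x_t) / P(q)."""
    return posteriors / priors  # elementwise, shape (T, n_states)

def viterbi(scaled_lik, trans, init):
    """Plain Viterbi decoding over the scaled likelihoods (log domain)."""
    T, n = scaled_lik.shape
    log_b = np.log(scaled_lik)
    delta = np.log(init) + log_b[0]
    back = np.zeros((T, n), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + np.log(trans)  # (from_state, to_state)
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + log_b[t]
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Toy example: 2 states, 3 frames of hypothetical ANN posteriors.
posts = np.array([[0.9, 0.1], [0.6, 0.4], [0.1, 0.9]])
priors = np.array([0.5, 0.5])
trans = np.array([[0.8, 0.2], [0.2, 0.8]])
init = np.array([0.5, 0.5])
best = viterbi(posteriors_to_scaled_likelihoods(posts, priors), trans, init)
```

Here the transition structure keeps the path in state 0 until the posteriors clearly favor state 1.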


Neural Networks | 2001

Networks with trainable amplitude of activation functions

Edmondo Trentin

Network training algorithms have heavily concentrated on the learning of connection weights. Little effort has been made to learn the amplitude of activation functions, which defines the range of values that the function can take. This paper introduces novel algorithms to learn the amplitudes of nonlinear activations in layered networks, without any assumption on their analytical form. Three instances of the algorithms are developed: (i) a common amplitude is shared among all nonlinear units; (ii) each layer has its own amplitude; and (iii) neuron-specific amplitudes are allowed. The algorithms can also be seen as a particular double-step gradient-descent procedure, as gradient-driven adaptive learning rate schemes, or as weight-grouping techniques that are consistent with known scaling laws for regularization with weight decay. As a side effect, a self-pruning mechanism of redundant neurons may emerge. Experimental results on function approximation, classification, and regression tasks, with synthetic and real-world data, validate the approach and show that the algorithms speed up convergence and modify the search path in the weight space, possibly reaching deeper minima that may also improve generalization.
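A minimal sketch of variant (iii), neuron-specific amplitudes, under assumed details (architecture, data, and learning rate are illustrative, not the paper's exact setup): the hidden activations are lam_i * tanh(z_i), and the amplitudes lam_i are updated by gradient descent alongside the weights.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, (64, 1))
y = np.sin(3 * X)  # toy regression target

W1 = rng.normal(0, 1, (1, 8)); b1 = np.zeros(8)
W2 = rng.normal(0, 1, (8, 1)); b2 = np.zeros(1)
lam = np.ones(8)                      # trainable neuron-specific amplitudes
lr = 0.02

def forward(X):
    z = X @ W1 + b1
    h = lam * np.tanh(z)              # amplitude-scaled activation
    return z, h, h @ W2 + b2

loss0 = float(np.mean((forward(X)[2] - y) ** 2))
for _ in range(500):
    z, h, out = forward(X)
    g = 2 * (out - y) / len(X)        # dL/d(out) for MSE
    gW2 = h.T @ g; gb2 = g.sum(0)
    gh = g @ W2.T
    glam = (gh * np.tanh(z)).sum(0)   # gradient w.r.t. the amplitudes
    gz = gh * lam * (1 - np.tanh(z) ** 2)
    gW1 = X.T @ gz; gb1 = gz.sum(0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2
    lam -= lr * glam                  # amplitudes learned like any weight
loss1 = float(np.mean((forward(X)[2] - y) ** 2))
```

An amplitude driven toward zero effectively prunes its neuron, which is the self-pruning side effect the abstract mentions.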


Pattern Recognition Letters | 2014

Pattern classification and clustering: A review of partially supervised learning approaches

Friedhelm Schwenker; Edmondo Trentin

The paper categorizes and reviews the state-of-the-art approaches to the partially supervised learning (PSL) task. Special emphasis is put on the fields of pattern recognition and clustering involving partially (or, weakly) labeled data sets. The major instances of PSL techniques are categorized into the following taxonomy: (i) active learning for training set design, where the learning algorithm has control over the training data; (ii) learning from fuzzy labels, whenever multiple and discordant human experts are involved in the (complex) data labeling process; (iii) semi-supervised learning (SSL) in pattern classification (further sorted out into self-training, SSL with generative models, semi-supervised support vector machines, and SSL with graphs); (iv) SSL in data clustering, using additional constraints to incorporate expert knowledge into the clustering process; (v) PSL in ensembles and learning by disagreement; (vi) PSL in artificial neural networks. In addition to providing the reader with the general background and categorization of the area, the paper aims at pointing out the main issues which are still open, motivating the on-going investigations in PSL research.
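Self-training, one of the SSL instances listed in item (iii), can be sketched as follows (hedged, illustrative only: a nearest-centroid classifier and a made-up margin threshold stand in for the base learner and confidence criterion).

```python
import numpy as np

rng = np.random.default_rng(1)
labeled_X = np.array([[0.0, 0.0], [4.0, 4.0]])
labeled_y = np.array([0, 1])
# Unlabeled points drawn near the two classes.
unlabeled_X = np.vstack([rng.normal(0, 0.5, (20, 2)),
                         rng.normal(4, 0.5, (20, 2))])

def centroids(X, y):
    return np.array([X[y == c].mean(axis=0) for c in (0, 1)])

def predict(X, C):
    d = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2)
    return d.argmin(axis=1), d

# Self-training loop: pseudo-label confident points, then retrain.
X_tr, y_tr = labeled_X.copy(), labeled_y.copy()
for _ in range(3):
    C = centroids(X_tr, y_tr)
    pseudo, dist = predict(unlabeled_X, C)
    confident = np.abs(dist[:, 0] - dist[:, 1]) > 1.0  # margin threshold
    X_tr = np.vstack([labeled_X, unlabeled_X[confident]])
    y_tr = np.concatenate([labeled_y, pseudo[confident]])

final_pred, _ = predict(unlabeled_X, centroids(X_tr, y_tr))
```

With only one labeled point per class, the confident pseudo-labels pull the centroids toward the true cluster means.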


Pattern Analysis and Applications | 2015

Techniques for dealing with incomplete data: a tutorial and survey

Marco Aste; Massimo Boninsegna; Antonino Freno; Edmondo Trentin

Real-world applications of pattern recognition, or machine learning algorithms, often present situations where the data are partly missing, corrupted by noise, or otherwise incomplete. In spite of that, developments in the machine learning community in the last decade have mostly focused on mathematical analysis of learning machines, making it difficult for practitioners to recollect an overview of major approaches to this issue. Paradoxically, as a consequence, even established methodologies rooted in statistics appear to have long been forgotten. Although the relevant literature on the topic is so wide that no exhaustive coverage is nowadays possible, the first goal of this paper is to provide the reader with a nonetheless significant survey of major, or utterly sound, techniques for dealing with the tasks of pattern recognition, machine learning, and density estimation from incomplete data. Secondly, the paper aims at representing a viable tutorial tool for the interested practitioner, by allowing for self-contained, step-by-step understanding of several approaches. An effort is made to categorize the different techniques as follows: (1) heuristic methods; (2) statistical approaches; (3) connectionist-oriented techniques; (4) other approaches (dynamical systems, adversarial deletion of features, etc.).
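The simplest heuristic from category (1) can be shown in a few lines: per-feature mean imputation of missing entries, with NaN marking missingness (illustrative only; the survey covers far more principled statistical and connectionist approaches).

```python
import numpy as np

def mean_impute(X):
    """Replace NaNs with the column-wise mean of the observed values."""
    X = X.astype(float).copy()
    col_means = np.nanmean(X, axis=0)     # means ignoring NaNs
    rows, cols = np.where(np.isnan(X))
    X[rows, cols] = np.take(col_means, cols)
    return X

X = np.array([[1.0, 2.0],
              [np.nan, 4.0],
              [3.0, np.nan]])
X_filled = mean_impute(X)
# Observed column means are [2.0, 3.0], so both gaps get those values.
```

Mean imputation distorts the variance and the correlations of the data, which is precisely why the statistical approaches of category (2), such as EM-based estimation, are often preferred.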


Pattern Recognition Letters | 2014

Combination of supervised and unsupervised learning for training the activation functions of neural networks

Ilaria Castelli; Edmondo Trentin

Standard feedforward neural networks benefit from the nice theoretical properties of mixtures of sigmoid activation functions, but they may fail in several practical learning tasks. These tasks would be better faced by relying on a more appropriate, problem-specific basis of activation functions. The paper presents a connectionist model which exploits adaptive activation functions. Each hidden unit in the network is associated with a specific pair (f(.), p(.)), where f(.) is the activation function and p(.) is the likelihood of the unit being relevant to the computation of the network output over the current input. The function f(.) is optimized in a supervised manner, while p(.) is realized via a statistical parametric model learned through unsupervised (or partially supervised) estimation. Since f(.) and p(.) influence each other's learning process, the overall machine is implicitly a co-trained coupled model and, in turn, a flexible, non-standard neural architecture. Feasibility of the approach is corroborated by empirical evidence yielded by computer simulations involving regression and classification tasks.
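A toy sketch of the (f(.), p(.)) pairing described above, with assumed details (a Gaussian relevance model and a tanh activation are stand-ins; only the forward pass is shown, whereas the paper co-trains both components):

```python
import numpy as np

def gaussian_relevance(x, mu, sigma):
    """p(.): parametric relevance of the unit for input x."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2)

def hidden_unit_output(x, w, b, mu, sigma):
    f = np.tanh(w * x + b)                 # supervised activation f(.)
    p = gaussian_relevance(x, mu, sigma)   # unsupervised relevance p(.)
    return p * f                           # relevance-gated contribution

# A unit "responsible" for inputs near mu=0 contributes strongly there
# and almost nothing for a far-away input.
near = hidden_unit_output(0.1, w=2.0, b=0.0, mu=0.0, sigma=1.0)
far = hidden_unit_output(5.0, w=2.0, b=0.0, mu=0.0, sigma=1.0)
```

The gating makes each activation function locally specialized, which is the sense in which the model builds a problem-specific basis.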


Knowledge Discovery and Data Mining | 2009

Scalable pseudo-likelihood estimation in hybrid random fields

Antonino Freno; Edmondo Trentin; Marco Gori

Learning probabilistic graphical models from high-dimensional datasets is a computationally challenging task. In many interesting applications, the domain dimensionality is such as to prevent state-of-the-art statistical learning techniques from delivering accurate models in reasonable time. This paper presents a hybrid random field model for pseudo-likelihood estimation in high-dimensional domains. A theoretical analysis proves that the class of pseudo-likelihood distributions representable by hybrid random fields strictly includes the class of joint probability distributions representable by Bayesian networks. In order to learn hybrid random fields from data, we develop the Markov Blanket Merging algorithm. Theoretical and experimental evidence shows that Markov Blanket Merging scales up very well to high-dimensional datasets. As compared to other widely used statistical learning techniques, Markov Blanket Merging delivers accurate results in a number of link prediction tasks, while also achieving significant improvements in terms of computational efficiency. Our software implementation of the models investigated in this paper is publicly available at http://www.dii.unisi.it/~freno/. The same website also hosts the datasets used in this work that are not available elsewhere, with the same preprocessing used for our experiments.
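The pseudo-likelihood objective at the core of the paper can be illustrated on a tiny example (a simplified, assumed setup: a pairwise binary Markov random field with Ising-style conditionals, not the hybrid random field itself). For each variable one computes P(x_i | x_-i), which depends only on the variable's neighbors, i.e. its Markov blanket.

```python
import numpy as np

W = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, -0.5],
              [0.0, -0.5, 0.0]])  # symmetric pairwise couplings
x = np.array([1, 1, -1])          # current configuration, spins in {-1, +1}

def log_pseudo_likelihood(x, W):
    """Sum of log conditionals log P(x_i | x_-i), one per variable."""
    lpl = 0.0
    for i in range(len(x)):
        field = W[i] @ x                        # only neighbors contribute
        p_plus = 1.0 / (1.0 + np.exp(-2.0 * field))  # P(x_i=+1 | rest)
        p_i = p_plus if x[i] == 1 else 1.0 - p_plus
        lpl += np.log(p_i)
    return lpl

lpl = log_pseudo_likelihood(x, W)
```

Because each conditional touches only a Markov blanket, the objective decomposes into small local terms, which is what makes pseudo-likelihood estimation scale to high-dimensional domains.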


Pattern Recognition Letters | 2015

Emotion recognition from speech signals via a probabilistic echo-state network

Edmondo Trentin; Stefan Scherer; Friedhelm Schwenker

The paper presents a probabilistic echo-state network (π-ESN) for density estimation over variable-length sequences of multivariate random vectors. The π-ESN stems from the combination of the reservoir of an ESN and a parametric density model based on radial basis functions. A constrained maximum likelihood training algorithm is introduced, suitable for sequence classification. Extensions of the algorithm to unsupervised clustering and semi-supervised learning (SSL) of sequences are proposed. Experiments in emotion recognition from speech signals are conducted on the WaSeP dataset. Compared with established techniques, the π-ESN yields the highest recognition accuracies, and shows interesting clustering and SSL capabilities.
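A hedged sketch of the ESN reservoir the π-ESN builds on (the RBF density layer and the constrained ML training are omitted, and all sizes are illustrative): fixed random recurrent weights, rescaled to spectral radius below 1 for the echo-state property, map a variable-length input sequence to a sequence of reservoir states.

```python
import numpy as np

rng = np.random.default_rng(42)
n_in, n_res = 3, 50
W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))     # fixed input weights
W = rng.uniform(-0.5, 0.5, (n_res, n_res))       # fixed recurrent weights
W *= 0.9 / max(abs(np.linalg.eigvals(W)))        # spectral radius -> 0.9

def reservoir_states(seq):
    """Run the reservoir over a (T, n_in) sequence; states are (T, n_res)."""
    h = np.zeros(n_res)
    states = []
    for x in seq:
        h = np.tanh(W_in @ x + W @ h)
        states.append(h.copy())
    return np.array(states)

seq = rng.normal(0, 1, (20, n_in))               # a length-20 input sequence
H = reservoir_states(seq)
rho = max(abs(np.linalg.eigvals(W)))
```

Only a readout on top of the states is ever trained in a standard ESN; in the π-ESN that readout is replaced by the RBF-based density model described in the abstract.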


Artificial Neural Networks in Pattern Recognition | 2006

Simple and effective connectionist nonparametric estimation of probability density functions

Edmondo Trentin

Estimation of probability density functions (pdf) is one major topic in pattern recognition. Parametric techniques rely on an arbitrary assumption on the form of the underlying, unknown distribution. Nonparametric techniques remove this assumption. In particular, the Parzen Window (PW) relies on a combination of local window functions centered in the patterns of a training sample. Although effective, PW suffers from several limitations. Artificial neural networks (ANNs) are, in principle, an alternative family of nonparametric models. ANNs are intensively used to estimate probabilities (e.g., class-posterior probabilities), but they have not been exploited so far to estimate pdfs. This paper introduces a simple neural-based algorithm for unsupervised, nonparametric estimation of pdfs, relying on PW. The approach overcomes the limitations of PW, possibly leading to improved pdf models. An experimental demonstration of the behavior of the algorithm w.r.t. PW is presented, using random samples drawn from a standard exponential pdf.
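The Parzen Window baseline the paper builds on is easy to sketch (a Gaussian kernel of bandwidth h centered at each training point; the neural refinement proposed in the paper is not reproduced here), using the same standard exponential pdf as in the abstract's demonstration.

```python
import numpy as np

def parzen_pdf(x, sample, h):
    """Gaussian-kernel Parzen Window estimate of a 1-D pdf at points x."""
    x = np.atleast_1d(x)[:, None]                     # shape (m, 1)
    kernels = np.exp(-0.5 * ((x - sample[None, :]) / h) ** 2)
    kernels /= h * np.sqrt(2 * np.pi)                 # normalize each window
    return kernels.mean(axis=1)                       # average over sample

rng = np.random.default_rng(0)
sample = rng.exponential(scale=1.0, size=2000)        # standard exponential
xs = np.array([0.5, 1.0, 2.0])
est = parzen_pdf(xs, sample, h=0.2)
true = np.exp(-xs)                                    # true pdf e^{-x}
```

The limitations the paper refers to are visible even here: the fixed bandwidth h must be hand-tuned, and every query requires touching the whole training sample.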


International Symposium on Neural Networks | 2009

Scalable statistical learning: A modular Bayesian/Markov network approach

Antonino Freno; Edmondo Trentin; Marco Gori

In this paper we propose a hybrid probabilistic graphical model for pseudo-likelihood estimation in high-dimensional domains. The model is based on Bayesian networks and Markov random fields. On the one hand, we prove that the proposed model is more expressive than Bayesian networks in terms of the representable distributions. On the other hand, we develop a computationally efficient structure learning algorithm, and we provide theoretical and experimental evidence showing how the modular nature of our model allows structure learning to scale up very well to high-dimensional datasets. The capability of the hybrid model to accurately learn complex networks of conditional independencies is illustrated by promising results in pattern recognition applications.


Neurocomputing | 2006

Inversion-based nonlinear adaptation of noisy acoustic parameters for a neural/HMM speech recognizer

Edmondo Trentin; Marco Gori

Spoken human–machine interaction in real-world environments requires acoustic models that are robust to changes in acoustic conditions, e.g. the presence of noise. Unfortunately, the popular hidden Markov models (HMMs) are not noise tolerant. One way to increase recognition performance is to acquire a small adaptation set of noisy utterances, which is used to estimate a normalization mapping between noisy and clean features to be fed into the acoustic model. This paper proposes an unsupervised maximum-likelihood gradient-ascent training algorithm (instead of the usual least-squares regression) for a neural feature-adaptation module, properly combined with a hybrid connectionist/HMM speech recognizer. The algorithm is inspired by the so-called "inversion principle", which prescribes the optimization of the input features instead of the model parameters. Simulation results on a real-world, speaker-independent continuous speech corpus of connected Italian digits, corrupted by noise, validate the approach. A small neural net (13 hidden neurons) trained over a single adaptation utterance for one iteration yields an 18.79% relative word error rate (WER) reduction over the bare hybrid, and a 65.10% relative WER reduction over the Gaussian-based HMM.
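The least-squares baseline the paper improves upon can be sketched on synthetic data (all numbers assumed; the paper's actual contribution, unsupervised maximum-likelihood training of a neural mapping via the inversion principle, is not reproduced here): fit a linear mapping from noisy to clean features on a small adaptation set.

```python
import numpy as np

rng = np.random.default_rng(3)
clean = rng.normal(0, 1, (200, 13))     # e.g. 13 cepstral features per frame
# Simulated corruption: scaling, offset, and additive noise.
noisy = 0.7 * clean + 0.3 + rng.normal(0, 0.1, clean.shape)

# Fit clean ~ noisy @ A + b by ordinary least squares.
X = np.hstack([noisy, np.ones((len(noisy), 1))])   # append bias column
coef, *_ = np.linalg.lstsq(X, clean, rcond=None)
restored = X @ coef                                # normalized features

err_before = float(np.mean((noisy - clean) ** 2))
err_after = float(np.mean((restored - clean) ** 2))
```

Such a mapping normalizes the features before they reach the acoustic model; the paper's point is that a maximum-likelihood criterion tied to the recognizer itself adapts better than this generic regression objective.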

Collaboration


Dive into Edmondo Trentin's collaboration.

Top Co-Authors

Diego Giuliani
Fondazione Bruno Kessler

Marco Matassoni
Center for Information Technology

Nadia Mana
Fondazione Bruno Kessler