Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Leonardo Badino is active.

Publication


Featured research published by Leonardo Badino.


Neuropsychologia | 2014

Sensorimotor communication in professional quartets

Leonardo Badino; Alessandro D'Ausilio; Donald Glowinski; Antonio Camurri; Luciano Fadiga

Non-verbal group dynamics are often opaque to a formal quantitative analysis of communication flow. In this context, ensemble musicians can serve as a reliable model of expert group coordination. In fact, bodily motion is a critical component of inter-musician coordination and can thus be used as a valuable index of sensorimotor communication. Here we measured head movement kinematics of an expert quartet of musicians and, by applying Granger Causality analysis, we numerically described the causality patterns between participants. We found a clear positive relationship between the amount of communication and the complexity of the score segment. Furthermore, we also applied temporal and dynamical changes to the musical score, known to the first violin only. The perturbations were devised to force unidirectional communication between the leader of the quartet and the other participants. Results show that in these situations, unidirectional influence from the leader decreased, implying that effective leadership may require prior sharing of information between participants. In conclusion, we could measure the amount of information flow and sensorimotor group dynamics, suggesting that the fabric of leadership is built not upon exclusive possession of information but rather on sharing it.
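As a rough sketch of the analysis technique named in this abstract (not the authors' code), a bivariate Granger-causality F-test can be written with numpy; the two coupled signals, the lag order and the coupling strength below are invented for the example.

```python
import numpy as np

def granger_f(x, y, lags=2):
    """F-statistic for 'x Granger-causes y' at a given lag order.

    Compares a restricted AR model of y (own lags only) against an
    unrestricted model that also includes lagged values of x.
    """
    n = len(y)
    Y = y[lags:]
    own = np.column_stack([y[lags - k:n - k] for k in range(1, lags + 1)])
    cross = np.column_stack([x[lags - k:n - k] for k in range(1, lags + 1)])
    ones = np.ones((n - lags, 1))
    Xr = np.hstack([ones, own])           # restricted design matrix
    Xu = np.hstack([ones, own, cross])    # unrestricted design matrix
    rss = lambda X: np.sum((Y - X @ np.linalg.lstsq(X, Y, rcond=None)[0]) ** 2)
    rss_r, rss_u = rss(Xr), rss(Xu)
    df1, df2 = lags, (n - lags) - Xu.shape[1]
    return ((rss_r - rss_u) / df1) / (rss_u / df2)

rng = np.random.default_rng(0)
x = rng.standard_normal(500)
y = np.zeros(500)
for t in range(2, 500):                   # y is driven by past values of x
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + 0.1 * rng.standard_normal()

print(granger_f(x, y) > granger_f(y, x))  # → True
```

The driving direction yields a much larger F-statistic because past values of x reduce the residual error of y far more than y's own history alone.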


International Conference on Acoustics, Speech, and Signal Processing | 2014

An auto-encoder based approach to unsupervised learning of subword units

Leonardo Badino; Claudia Canevari; Luciano Fadiga; Giorgio Metta

In this paper we propose an auto-encoder-based method for the unsupervised identification of subword units. We experiment with different types and architectures of auto-encoders to assess which auto-encoder properties are most important for this task. We first show that the encoded representation of speech produced by standard auto-encoders is more effective than Gaussian posteriorgrams in a spoken query classification task. Finally, we evaluate the subword inventories produced by the proposed method both in terms of classification accuracy in a word classification task (with lexicon sizes up to 263 words) and in terms of consistency between subword transcriptions of different examples of the same word type. The evaluation is carried out on Italian and American English datasets.
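To illustrate the general recipe (encode frames with an auto-encoder, then group the encodings into unit candidates), here is a minimal numpy sketch; the synthetic frames, tiny architecture and plain k-means step are assumptions for the example, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "acoustic frames": three latent unit types in a 20-D feature
# space (invented for the sketch; the paper uses real speech).
centers = 2.0 * rng.standard_normal((3, 20))
labels = rng.integers(0, 3, size=600)
frames = centers[labels] + 0.3 * rng.standard_normal((600, 20))

# One-hidden-layer auto-encoder with a tanh bottleneck, trained by plain
# gradient descent on the mean squared reconstruction error.
d, h, lr = 20, 4, 0.005
W1 = 0.1 * rng.standard_normal((d, h)); b1 = np.zeros(h)
W2 = 0.1 * rng.standard_normal((h, d)); b2 = np.zeros(d)

def reconstruct():
    return np.tanh(frames @ W1 + b1) @ W2 + b2

err_before = np.mean((reconstruct() - frames) ** 2)
for _ in range(500):
    Z = np.tanh(frames @ W1 + b1)            # encoder
    G = 2 * (Z @ W2 + b2 - frames) / len(frames)
    GZ = (G @ W2.T) * (1 - Z ** 2)           # backprop through tanh
    W2 -= lr * Z.T @ G;       b2 -= lr * G.sum(0)
    W1 -= lr * frames.T @ GZ; b1 -= lr * GZ.sum(0)
err_after = np.mean((reconstruct() - frames) ** 2)

# Cluster the encoded frames: each cluster is a candidate subword unit.
codes = np.tanh(frames @ W1 + b1)
C = codes[rng.choice(len(codes), 3, replace=False)]
for _ in range(20):
    assign = np.argmin(((codes[:, None] - C[None]) ** 2).sum(-1), axis=1)
    C = np.array([codes[assign == k].mean(0) if np.any(assign == k) else C[k]
                  for k in range(3)])
```

The cluster index assigned to each frame plays the role of a subword label; in the paper the encodings come from much deeper networks trained on real speech.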


Spoken Language Technology Workshop | 2012

Deep-level acoustic-to-articulatory mapping for DBN-HMM based phone recognition

Leonardo Badino; Claudia Canevari; Luciano Fadiga; Giorgio Metta

In this paper we experiment with methods based on Deep Belief Networks (DBNs) to recover measured articulatory data from speech acoustics. Our acoustic-to-articulatory mapping (AAM) processes go through multi-layered and hierarchical (i.e., deep) representations of the acoustic and articulatory domains obtained through unsupervised learning of DBNs. The unsupervised learning of DBNs can serve two purposes: (i) pre-training of the Multi-layer Perceptrons that perform AAM; (ii) transformation of the articulatory domain that is recovered from acoustics through AAM. The recovered articulatory features are combined with MFCCs to compute phone posteriors for phone recognition. Tested on the MOCHA-TIMIT corpus, the recovered articulatory features, when combined with MFCCs, yield up to a 16.6% relative phone error reduction with respect to a phone recognizer that uses MFCCs alone.
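The overall pipeline (learn an acoustic-to-articulatory mapping, then append the recovered features to the acoustic observations) can be sketched with a linear least-squares stand-in for the deep networks; the synthetic 13-D "MFCC" and 6-D articulatory data below are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-ins: 13-D "MFCC" frames and 6-D articulatory trajectories
# that depend linearly on the acoustics plus noise (a toy assumption; the
# paper uses deep networks and real MOCHA-TIMIT recordings).
A = rng.standard_normal((6, 13))
mfcc_train = rng.standard_normal((400, 13))
art_train = mfcc_train @ A.T + 0.1 * rng.standard_normal((400, 6))

# Fit the acoustic-to-articulatory mapping by least squares.
W, *_ = np.linalg.lstsq(mfcc_train, art_train, rcond=None)

# At recognition time articulatory data are unavailable: recover them from
# acoustics and append them to the observation vector.
mfcc_test = rng.standard_normal((10, 13))
obs = np.hstack([mfcc_test, mfcc_test @ W])   # 13 + 6 = 19-D observations
```

The enlarged observation vectors `obs` are what a downstream phone classifier would consume in place of the acoustics alone.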


PLOS ONE | 2011

The use of phonetic motor invariants can improve automatic phoneme discrimination

Claudio Castellini; Leonardo Badino; Giorgio Metta; Giulio Sandini; Michele Tavella; Mirko Grimaldi; Luciano Fadiga

We investigate the use of phonetic motor invariants (MIs), that is, recurring kinematic patterns of the human phonetic articulators, to improve automatic phoneme discrimination. Using a multi-subject database of synchronized speech and lips/tongue trajectories, we first identify MIs commonly associated with bilabial and dental consonants, and use them to simultaneously segment speech and motor signals. We then build a simple neural network-based regression schema (called Audio-Motor Map, AMM) mapping audio features of these segments to the corresponding MIs. Extensive experimental results show that a small set of features extracted from the MIs, as originally gathered from articulatory sensors, are dramatically more effective than a large, state-of-the-art set of audio features in automatically discriminating bilabials from dentals; the same features, extracted from AMM-reconstructed MIs, are as effective as or better than the audio features when testing across speakers and coarticulating phonemes, and dramatically better as noise is added to the speech signal. These results seem to support some of the claims of the motor theory of speech perception and add experimental evidence of the actual usefulness of MIs in the more general framework of automated speech recognition.


Computer Speech & Language | 2016

Integrating articulatory data in deep neural network-based acoustic modeling

Leonardo Badino; Claudia Canevari; Luciano Fadiga; Giorgio Metta

Highlights:
- We test strategies to exploit articulatory data in DNN-HMM phone recognition.
- Autoencoder-transformed articulatory features produce the best results.
- Pre-training of phone classifier DNNs driven by acoustic-to-articulatory mapping.
- Utility of articulatory information in noisy conditions and in cross-speaker settings.

Hybrid deep neural network-hidden Markov model (DNN-HMM) systems have become the state of the art in automatic speech recognition. In this paper we experiment with DNN-HMM phone recognition systems that use measured articulatory information. Deep neural networks are used both to compute phone posterior probabilities and to perform acoustic-to-articulatory mapping (AAM). The AAM processes we propose are based on deep representations of the acoustic and articulatory domains. Such representations make it possible to: (i) create different pre-training configurations of the DNNs that perform AAM; (ii) perform AAM on a transformed (through DNN autoencoders) articulatory feature (AF) space that captures strong statistical dependencies between articulators. Traditionally, neural networks that approximate the AAM are used to generate AFs that are appended to the observation vector of the speech recognition system. Here we also study a novel approach (AAM-based pretraining) where a DNN performing the AAM is instead used to pretrain the DNN that computes the phone posteriors. Evaluations on both the MOCHA-TIMIT msak0 and the mngu0 datasets show that: (i) the recovered AFs reduce the phone error rate (PER) in both clean and noisy speech conditions, with a maximum 10.1% relative phone error reduction in clean speech conditions obtained when autoencoder-transformed AFs are used; (ii) AAM-based pretraining could be a viable strategy to exploit available small articulatory datasets to improve acoustic models trained on large acoustic-only datasets.


Spoken Language Technology Workshop | 2008

Automatic labeling of contrastive word pairs from spontaneous spoken English

Leonardo Badino; Robert A. J. Clark

This paper addresses the problem of automatically labeling contrast in spontaneous speech, where contrast is meant as a relation that ties two words that explicitly contrast with each other. Detection of contrast is certainly relevant in the analysis of discourse and information structure and, because of the prosodic correlates of contrast, could also play an important role in speech applications, such as text-to-speech synthesis, that need accurate, discourse-context-related modeling of prosody. With this prospect in mind, we investigate the feasibility of automatic contrast labeling by training and evaluating on the Switchboard corpus a novel contrast tagger, based on support vector machines (SVMs), that combines lexical features, syntactic dependencies and WordNet semantic relations.
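A linear SVM over binary word-pair features can be sketched with subgradient descent on the regularized hinge loss; the three features and the labeling rule below are invented stand-ins for the paper's lexical, syntactic-dependency and WordNet features.

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy binary features per word pair, hypothetical stand-ins for the paper's
# feature set (e.g. same part of speech, WordNet antonymy, parallel
# dependency role).
n = 300
X = rng.integers(0, 2, size=(n, 3)).astype(float)
# Invented rule for the toy data: antonym pairs (feature 1) are contrastive.
y = np.where(X[:, 1] > 0, 1.0, -1.0)

# Linear SVM trained with subgradient descent on the hinge loss.
w, b, lr, lam = np.zeros(3), 0.0, 0.05, 0.001
for _ in range(2000):
    viol = y * (X @ w + b) < 1                          # margin violations
    w -= lr * (lam * w - (y[viol][:, None] * X[viol]).sum(0) / n)
    b -= lr * (-y[viol].sum() / n)

acc = np.mean(np.sign(X @ w + b) == y)
```

On this separable toy data the tagger recovers the rule; the paper's tagger of course works from real Switchboard annotations and a much richer feature set.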


Cognitive Processing | 2015

Automatic imitation of the arm kinematic profile in interacting partners

Alessandro D’Ausilio; Leonardo Badino; Pietro Cipresso; Alice Chirico; Elisabetta Ferrari; Giuseppe Riva; Andrea Gaggioli

Cognitive neuroscience, traditionally focused on individual brains, is just beginning to investigate social cognition through realistic interpersonal interaction. However, quantitative investigation of the dynamical sensorimotor communication among interacting individuals in goal-directed ecological tasks is particularly challenging. Here, we recorded upper-body motion capture of 23 dyads, alternating their leader/follower role, in a tower-building task. Either a strategy of joining efforts or a strategy of independent action could in principle be used. We found that arm reach velocity profiles of participants tended to converge across trials. Automatic imitation of low-level motor control parameters demonstrates that the task is achieved through continuous action coordination as opposed to independent action planning. Moreover, the leader produced more consistent and predictable velocity profiles, suggesting an implicit strategy of signaling to the follower. This study serves as a validation of our joint goal-directed non-verbal task for future applications. In fact, the quantification of human-to-human continuous sensorimotor interaction, in a way that can be predicted and controlled, is probably one of the greatest challenges for the future of human–robot interaction.


Conference of the International Speech Communication Association | 2016

Phonetic Context Embeddings for DNN-HMM Phone Recognition

Leonardo Badino

This paper proposes an approach, named phonetic context embedding, to model phonetic context effects for deep neural network hidden Markov model (DNN-HMM) phone recognition. Phonetic context embeddings can be regarded as continuous and distributed vector representations of context-dependent phonetic units (e.g., triphones). In this work they are computed using neural networks. First, all phone labels are mapped into vectors of binary distinctive features (DFs, e.g., nasal/not nasal). Then, for each speech frame, the corresponding DF vector is concatenated with the DF vectors of the previous and next frames and fed into a neural network that is trained to estimate the acoustic coefficients (e.g., MFCCs) of that frame. The values of the first hidden layer represent the embedding of the input DF vectors. Finally, the resulting embeddings are used as secondary-task targets in a multi-task learning (MTL) setting when training the DNN that computes phone state posteriors. The approach makes it easy to encode a much larger context than alternative MTL-based approaches. Results on TIMIT with a fully connected DNN show phone error rate (PER) reductions from 22.4% to 21.0% on the core test set and from 21.3% to 19.8% on the validation set, respectively, and a lower PER than an alternative strong MTL approach.
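The embedding computation described above can be sketched in numpy; the DF table, window size and network sizes are toy assumptions, and the random "acoustics" only illustrate the pipeline shape, not a meaningful embedding.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy sizes and random data, all invented for the sketch: a binary
# distinctive-feature (DF) table with one row per phone, a phone label per
# speech frame, and that frame's acoustic coefficients.
n_df, context, hidden, n_mfcc = 10, 2, 16, 13
df_table = rng.integers(0, 2, size=(40, n_df)).astype(float)
frame_phones = rng.integers(0, 40, size=100)
mfcc = rng.standard_normal((100, n_mfcc))

# Concatenate each frame's DF vector with those of its +/- `context`
# neighbours (edges padded by repeating the first/last frame).
padded = np.vstack([df_table[frame_phones[0]]] * context
                   + [df_table[frame_phones]]
                   + [df_table[frame_phones[-1]]] * context)
windows = np.hstack([padded[i:i + 100] for i in range(2 * context + 1)])

# One-hidden-layer net trained to predict each frame's acoustics from its
# DF-context window; the hidden activations are the context embeddings
# later used as secondary MTL targets.
W1 = 0.1 * rng.standard_normal((windows.shape[1], hidden))
W2 = 0.1 * rng.standard_normal((hidden, n_mfcc))
for _ in range(200):
    H = np.tanh(windows @ W1)
    err = (H @ W2 - mfcc) / len(mfcc)     # mean-squared-error gradient
    W2 -= 0.05 * H.T @ err
    W1 -= 0.05 * windows.T @ ((err @ W2.T) * (1 - H ** 2))

embeddings = np.tanh(windows @ W1)        # one embedding per frame
```

Widening the context only grows the input layer of this small regression net, which is why the approach scales to much larger windows than label-based MTL targets.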


Arts and Technology | 2013

Towards Automated Analysis of Joint Music Performance in the Orchestra

Giorgio Gnecco; Leonardo Badino; Antonio Camurri; Alessandro D’Ausilio; Luciano Fadiga; Donald Glowinski; Marcello Sanguineti; Giovanna Varni; Gualtiero Volpe

Preliminary results from a study of expressivity and of non-verbal social signals in small groups of users are presented. Music is selected as the experimental test-bed since it is a clear example of interactive and social activity, where affective non-verbal communication plays a fundamental role. In this experiment the orchestra is adopted as a social group characterized by a clear leader (the conductor) of two groups of musicians (the first and second violin sections). It is shown how a reduced set of simple movement features - head movements - can be sufficient to explain the difference in the behavior of the first violin section between two performance conditions, characterized by different eye contact between the two violin sections and between the first section and the conductor.


IEEE Transactions on Cognitive and Developmental Systems | 2017

Multilevel Behavioral Synchronization in a Joint Tower-Building Task

Moreno I. Coco; Leonardo Badino; Pietro Cipresso; Alice Chirico; Elisabetta Ferrari; Giuseppe Riva; Andrea Gaggioli; Alessandro D'Ausilio

Human-to-human sensorimotor interaction can only be fully understood by modeling the patterns of bodily synchronization and reconstructing the underlying mechanisms of optimal cooperation. We designed a tower-building task to address this goal. We recorded upper-body kinematics of dyads and focused on the velocity profiles of the head and wrist. We applied recurrence quantification analysis to examine the dynamics of synchronization within and across the experimental trials, and to compare the roles of leader and follower. Our results show that the leader was more auto-recurrent than the follower, making his/her behavior more predictable. When looking at the cross-recurrence of the dyad, we find different patterns of synchronization for head and wrist motion. At the wrist, dyads synchronized at short lags, and this pattern was weakly modulated within trials and invariant across them. Head motion, instead, synchronized at longer lags and increased both within and between trials: a phenomenon mostly driven by the leader. Our findings point to a multilevel nature of human-to-human sensorimotor synchronization and may provide an experimentally solid benchmark to identify the basic primitives of motion that maximize behavioral coupling between humans and artificial agents.
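A minimal form of the cross-recurrence analysis used here can be sketched in numpy: the overall recurrence rate, plus a diagonal profile that locates the lag at which two signals synchronize. The sinusoidal "velocity profiles" are synthetic assumptions made for the example.

```python
import numpy as np

def recurrence_rate(a, b, radius=0.1):
    """Fraction of all time-point pairs where the two signals fall within
    `radius` of each other (the simplest cross-recurrence measure)."""
    return np.mean(np.abs(a[:, None] - b[None, :]) < radius)

rng = np.random.default_rng(3)
t = np.linspace(0, 4 * np.pi, 200)
leader = np.sin(t)
follower = np.sin(t - 0.3) + 0.05 * rng.standard_normal(200)  # lagged copy
unrelated = rng.standard_normal(200)

# The follower's profile recurs with the leader's far more than noise does.
print(recurrence_rate(leader, follower) > recurrence_rate(leader, unrelated))

# Diagonal recurrence profile: mean recurrence at each positive lag k;
# its peak estimates the lag at which the dyad synchronizes.
profile = [np.mean(np.abs(leader[:200 - k] - follower[k:]) < 0.1)
           for k in range(15)]
best_lag = int(np.argmax(profile))
```

With real motion-capture data, standard RQA adds embedding dimensions and line-based measures (determinism, laminarity) on top of this recurrence count.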

Collaboration


Dive into Leonardo Badino's collaborations.

Top Co-Authors

Luciano Fadiga, Istituto Italiano di Tecnologia
Alessandro D'Ausilio, Istituto Italiano di Tecnologia
Giorgio Metta, Istituto Italiano di Tecnologia
Claudia Canevari, Istituto Italiano di Tecnologia
Noël Nguyen, Aix-Marseille University