Publication


Featured research published by Rudy Rotili.


Journal of Electrical and Computer Engineering | 2010

Comparative evaluation of single-channel MMSE-Based noise reduction schemes for speech recognition

Simone Cifani; Rudy Rotili; Stefano Squartini; Francesco Piazza

One of the big challenges in Automatic Speech Recognition (ASR) is developing solutions that also work properly in adverse acoustic conditions, such as in the presence of additive noise and/or in reverberant rooms. Recently, some attention has been paid to deeply integrating the noise suppressor into the feature extraction pipeline. In this paper, different single-channel MMSE-based noise reduction schemes have been implemented in both the frequency and cepstral domains, and the related recognition performance has been evaluated on the AURORA2 and AURORA4 databases, thereby providing a useful reference for the scientific community.


International Conference on Intelligent Computing | 2010

Joint Multichannel Blind Speech Separation and Dereverberation: A Real-Time Algorithmic Implementation

Rudy Rotili; Claudio De Simone; Alessandro Perelli; Simone Cifani; Stefano Squartini

Blind source separation (BSS) and dereverberation have been deeply investigated due to their importance in many applications, such as image and audio processing. A two-stage approach leading to a sequential source separation and speech dereverberation algorithm based on blind channel identification (BCI) has recently appeared in the literature and is taken here as the reference. In this contribution, a real-time implementation of that approach is presented. The optimum inverse filtering algorithm based on Bezout’s theorem and used in the dereverberation stage has been replaced with an iterative technique, which is computationally more efficient and allows the inversion of long impulse responses in real-time applications. The entire framework works in the frequency domain, and the NU-Tech software platform has been used for real-time simulations.


Asia Pacific Conference on Circuits and Systems | 2008

A robust iterative inverse filtering approach for speech dereverberation in presence of disturbances

Rudy Rotili; Simone Cifani; Stefano Squartini; Francesco Piazza

In the present work the inverse filtering problem for speech dereverberation in stationary conditions is addressed. In particular, we consider the presence of multiple observables, which has a beneficial impact on the invertibility of room transfer functions (RTFs). In actual acoustic environments the assumed knowledge of the RTFs is usually altered by disturbances in the form of additive noise or RTF fluctuations, inevitably resulting in reduced inverse filtering performance. Several approaches, mainly based on regularization theory, have appeared in the literature to address this problem. Among them, a recent study has shown that the dereverberation capabilities depend on some design parameters significantly related to the filter energy. In this paper that work is taken as the reference and its optimum inverse filtering approach is replaced with an iterative technique, which is typically much more computationally efficient. As shown by the results of several computer simulations, the resulting algorithm proves more robust than the reference counterpart to variations of the regularization parameter.
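
As a rough illustration of what an iterative, regularized multi-channel inverse filter can look like, the sketch below runs plain steepest descent on a least-squares cost with an energy penalty on the filters. It is not the algorithm of the paper: the function names, the time-domain formulation, the step-size rule and the parameter `delta` are all assumptions made for this example.

```python
def convolve(a, b):
    """Full linear convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def equalized_response(rtfs, filters):
    """Sum over channels of each RTF convolved with its inverse filter."""
    out = [0.0] * (len(rtfs[0]) + len(filters[0]) - 1)
    for h, g in zip(rtfs, filters):
        for n, v in enumerate(convolve(h, g)):
            out[n] += v
    return out


def iterative_inverse_filters(rtfs, filter_len, delta=1e-3, iters=500):
    """Steepest descent on 0.5*||sum_m h_m*g_m - d||^2 + 0.5*delta*sum_m ||g_m||^2.

    rtfs: equal-length impulse responses, one per channel (assumed known).
    delta: hypothetical regularization weight penalizing filter energy.
    """
    out_len = len(rtfs[0]) + filter_len - 1
    target = [1.0] + [0.0] * (out_len - 1)        # d: unit impulse, no delay
    filters = [[0.0] * filter_len for _ in rtfs]
    # Conservative step: 1 / (upper bound on the largest Hessian eigenvalue).
    step = 1.0 / (sum(sum(abs(x) for x in h) ** 2 for h in rtfs) + delta)
    for _ in range(iters):
        equalized = equalized_response(rtfs, filters)
        error = [y - t for y, t in zip(equalized, target)]
        for h, g in zip(rtfs, filters):
            for k in range(filter_len):
                # Gradient: correlation of the error with h, plus the penalty term.
                grad = sum(error[k + i] * h[i] for i in range(len(h))) + delta * g[k]
                g[k] -= step * grad
    return filters


rtfs = [[1.0, 0.5], [1.0, -0.4]]                  # toy two-channel example
filters = iterative_inverse_filters(rtfs, filter_len=1)
print(equalized_response(rtfs, filters))          # close to a unit impulse
```

Each iteration costs only a few convolutions, whereas a closed-form solution requires inverting a matrix whose size grows with the filter length. Larger values of `delta` trade equalization accuracy for lower filter energy.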


Neurocomputing | 2012

Environmental robust speech and speaker recognition through multi-channel histogram equalization

Stefano Squartini; Rudy Rotili; Francesco Piazza

Feature statistics normalization in the cepstral domain is one of the best-performing approaches for robust automatic speech and speaker recognition in noisy acoustic scenarios: feature coefficients are normalized by using suitable linear or nonlinear transformations in order to match the noisy speech statistics to the clean speech ones. Histogram equalization (HEQ) belongs to this category of algorithms, has proved to be effective for the purpose, and is therefore taken here as the reference. In this paper the presence of multiple acoustic channels is used to enhance the statistics modeling capabilities of the HEQ algorithm, by exploiting the availability of multiple noisy speech occurrences, with the aim of maximizing the effectiveness of the cepstra normalization process. Computer simulations based on the Aurora 2 database in speech and speaker recognition scenarios have shown that a significant recognition improvement with respect to the single-channel counterpart and other multi-channel techniques can be achieved, confirming the effectiveness of the idea. The proposed algorithmic configuration has also been combined with the kernel estimation technique in order to further improve speech recognition performance.
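
To make the HEQ idea concrete, here is a minimal sketch that maps one cepstral coefficient through its empirical CDF onto a standard Gaussian reference. Pooling the frames of all channels to estimate that CDF is only one simple reading of "exploiting multiple noisy speech occurrences" and is an assumption of this example, as are the function names and the choice of reference distribution; the paper's actual multi-channel scheme may differ.

```python
from bisect import bisect_right
from statistics import NormalDist


def heq_fit(samples):
    """Return the sorted samples, which serve as an empirical CDF table."""
    return sorted(samples)


def heq_apply(value, table, reference=NormalDist(0.0, 1.0)):
    """Map one coefficient through the empirical CDF onto the reference distribution."""
    n = len(table)
    rank = bisect_right(table, value)             # number of table entries <= value
    # Keep the CDF strictly inside (0, 1) so that inv_cdf stays finite.
    cdf = min(max((rank - 0.5) / n, 0.5 / n), 1.0 - 0.5 / n)
    return reference.inv_cdf(cdf)


def multichannel_heq(channels):
    """Equalize one coefficient track per channel using statistics pooled over channels."""
    pooled = heq_fit([v for track in channels for v in track])
    return [[heq_apply(v, pooled) for v in track] for track in channels]


# Two hypothetical channels observing the same coefficient over three frames.
equalized = multichannel_heq([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
print(equalized)
```

With more channels the pooled table holds more samples per utterance, so the empirical CDF is estimated from more data than in the single-channel case.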


Cognitive Computation | 2013

A Real-Time Speech Enhancement Framework in Noisy and Reverberated Acoustic Scenarios

Rudy Rotili; Stefano Squartini; Björn W. Schuller

This paper deals with speech enhancement in noisy reverberated environments where multiple speakers are active. The authors propose an advanced real-time speech processing front-end aimed at automatically reducing the distortions introduced by room reverberation in distant speech signals, also considering the presence of background noise, and thus to achieve a significant improvement in speech quality for each speaker. The overall framework is composed of three cooperating blocks, each one fulfilling a specific task: speaker diarization, room impulse responses identification and speech dereverberation. In particular, the speaker diarization algorithm pilots the operations performed in the other two algorithmic stages, which have been suitably designed and parametrized to operate with noisy speech observations. Extensive computer simulations have been performed by using a subset of the AMI database under different realistic noisy and reverberated conditions. Obtained results show the effectiveness of the approach.


International Conference on Intelligent Computing | 2011

Real-Time speech recognition in a multi-talker reverberated acoustic scenario

Rudy Rotili; Stefano Squartini; Björn W. Schuller

This paper proposes a real-time algorithmic framework for Automatic Speech Recognition (ASR) in the presence of multiple sources in a reverberated environment. The addressed real-life acoustic scenario calls for a robust signal processing solution to reduce the impact of source mixing and reverberation on ASR performance. Here the authors show how the implemented approach improves recognition accuracy under real-time processing constraints and with overlapping distant-talking speakers. A suitable database has been generated for the purpose by adapting an existing large vocabulary continuous speech recognition (LVCSR) corpus to the acoustic conditions under study.


International Symposium on Circuits and Systems | 2010

Robust speech recognition using feature-domain multi-channel bayesian estimators

Rudy Rotili; Simone Cifani; Lorenzo Marinelli; Stefano Squartini; Francesco Piazza

This paper proposes innovative multi-channel Bayesian estimators in the feature domain for robust speech recognition. Both minimum-mean-squared-error (MMSE) and maximum-a-posteriori (MAP) criteria have been explored: the related algorithms extend the multi-channel frequency-domain counterparts and generalize the single-channel feature-domain MMSE solution that recently appeared in the literature. Computer simulations conducted on a modified AURORA2 database show the efficacy of the frequency-domain multi-channel estimators when used as a pre-processing stage of a speech recognition engine, and that the proposed multi-channel MAP approach outperforms single-channel estimators by at least 3% on average.


Cognitive Computation | 2012

Real-Time Activity Detection in a Multi-Talker Reverberated Environment

Rudy Rotili; Martin Wöllmer; Florian Eyben; Stefano Squartini; Björn W. Schuller

This paper proposes a real-time person activity detection framework operating in the presence of multiple sources in reverberated environments. The framework is composed of two main parts: the speech enhancement front-end and the activity detector. The aim of the former is to automatically reduce the distortions introduced by room reverberation in the available distant speech signals and thus to achieve a significant improvement in speech quality for each speaker. The overall front-end is composed of three cooperating blocks, each one fulfilling a specific task: speaker diarization, room impulse response identification, and speech dereverberation. In particular, the speaker diarization algorithm is essential to pilot the operations performed in the other two stages in accordance with the speakers’ activity in the room. The activity estimation algorithm is based on bidirectional Long Short-Term Memory networks, which allow for context-sensitive activity classification from audio feature functionals extracted via the real-time speech feature extraction toolkit openSMILE. Extensive computer simulations have been performed by using a subset of the AMI database for activity evaluation in meetings: the obtained results confirm the effectiveness of the approach.


Archive | 2011

Multi-channel Feature Enhancement for Robust Speech Recognition

Rudy Rotili; Simone Cifani; Francesco Piazza; Stefano Squartini

In the last decades, a great deal of research has been devoted to extending our capacity for verbal communication with computers through automatic speech recognition (ASR). Although optimum performance can be reached when the speech signal is captured close to the speaker’s mouth, there are still obstacles to overcome in making reliable distant speech recognition (DSR) systems. The two major sources of degradation in DSR are distortions, such as additive noise and reverberation. This implies that speech enhancement techniques are typically required to achieve the best possible signal quality. Different methodologies have been proposed in the literature for environment robustness in speech recognition over the past two decades (Gong (1995); Hussain, Chetouani, Squartini, Bastari & Piazza (2007)). Two main classes can be identified (Li et al. (2009)). The first class encompasses the so-called model-based techniques, which operate on the acoustic model to adapt or adjust its parameters so that the system better fits the distorted environment. The most popular of such techniques are multi-style training (Lippmann et al. (2003)), parallel model combination (PMC) (Gales & Young (2002)) and the vector Taylor series (VTS) model adaptation (Moreno (1996)). Although model-based techniques obtain excellent results, they require heavy modifications to the decoding stage and, in most cases, a greater computational burden. Conversely, the second class directly enhances the speech signal before it is presented to the recognizer, and shows some significant advantages with respect to the previous class:


2009 Proceedings of 6th International Symposium on Image and Signal Processing and Analysis | 2009

A PEM-AFROW based algorithm for acoustic feedback control in automotive speech reinforcement systems

Simone Cifani; L. C. Montesi; Rudy Rotili; Stefano Squartini; Francesco Piazza

Developing well-performing speech reinforcement systems to improve the intra-cabin communication quality among car passengers in different seat rows, typically degraded by the distance between speakers (for instance in SUVs and minivans) and by the noise within the cabin, has been a challenging issue for the related scientific community. One of the main problems to solve in this scenario is reducing the electroacoustic coupling between loudspeakers and microphones in order to prevent the system from becoming unstable (howling), namely acoustic feedback cancellation (AFC). One of the best-performing techniques for AFC is the PEM-AFROW approach, which recently appeared in the literature. In this work, we propose an innovative feedback suppressor scheme based on the PEM-AFROW concept, which achieves a valuable balance of feedback reduction, maximum stable gain values and overall sound quality. Results obtained by computer simulations in the dual-channel communication scenario confirm the effectiveness of the idea.

Collaboration


Dive into Rudy Rotili's collaborations.

Top Co-Authors

Stefano Squartini
Marche Polytechnic University

Francesco Piazza
Marche Polytechnic University

Simone Cifani
Marche Polytechnic University

Cesare Rocchi
Marche Polytechnic University

L. C. Montesi
Marche Polytechnic University

Lorenzo Marinelli
Marche Polytechnic University