Francesco Nesta | Researchain

Publication

Featured researches published by Francesco Nesta.

IEEE Transactions on Audio, Speech, and Language Processing | 2011

Convolutive BSS of Short Mixtures by ICA Recursively Regularized Across Frequencies

Francesco Nesta; Piergiorgio Svaizer; Maurizio Omologo

This paper proposes a new method of frequency-domain blind source separation (FD-BSS), able to separate acoustic sources in challenging conditions. In frequency-domain BSS, the time-domain signals are transformed into time-frequency series and the separation is generally performed by applying independent component analysis (ICA) at each frequency envelope. When short signals are observed and long demixing filters are required, the number of time observations for each frequency is limited and the variance of the ICA estimator increases due to the intrinsic statistical bias. Furthermore, common methods used to solve the permutation problem fail, especially with sources recorded under highly reverberant conditions. We propose a recursively regularized implementation of the ICA (RR-ICA) that overcomes these problems by exploiting two types of deterministic knowledge: 1) continuity of the demixing matrix across frequencies; 2) continuity of the time-activity of the sources. The recursive regularization propagates the statistics of the sources across frequencies, reducing the effect of statistical bias and the occurrence of permutations. Experimental results on real data show that the algorithm can successfully perform a fast separation of short signals (e.g., 0.5-1 s), by estimating long demixing filters to deal with highly reverberant environments (e.g., ms).

international conference on latent variable analysis and signal separation | 2012

The 2011 signal separation evaluation campaign (SiSEC2011): - audio source separation -

Shoko Araki; Francesco Nesta; Emmanuel Vincent; Zbyněk Koldovský; Guido Nolte; Andreas Ziehe; Alexis Benichoux

This paper summarizes the audio part of the 2011 community-based Signal Separation Evaluation Campaign (SiSEC2011). Four speech and music datasets were contributed, including datasets recorded in noisy or dynamic environments and a subset of the SiSEC2010 datasets. The participants addressed one or more tasks out of four source separation tasks, and the results for each task were evaluated using different objective performance criteria. We provide an overview of the audio datasets, tasks and criteria. We also report the results achieved with the submitted systems, and discuss organization strategies for future campaigns.

IEEE Transactions on Audio, Speech, and Language Processing | 2012

Generalized State Coherence Transform for Multidimensional TDOA Estimation of Multiple Sources

Francesco Nesta; Maurizio Omologo

According to the physical meaning of the frequency-domain blind source separation (FD-BSS), each mixing matrix estimated by independent component analysis (ICA) contains information on the physical acoustic propagation related to each source and can therefore be used for localization purposes. In this paper, we analyze the Generalized State Coherence Transform (GSCT), which is a non-linear transform of the space represented by the whole set of demixing matrices. The transform enables an accurate estimation of the propagation time-delay of multiple sources in multiple dimensions. Furthermore, it is shown that with appropriate nonlinearities and a statistical model for the reverberation, GSCT can be considered an approximate kernel density estimator of the acoustic propagation time-delay. Experimental results confirm the good properties of the transform and its effectiveness in addressing multiple source TDOA detection (e.g., 2-D TDOA estimation of several sources with only three microphones).

IEEE Transactions on Audio, Speech, and Language Processing | 2011

Batch-Online Semi-Blind Source Separation Applied to Multi-Channel Acoustic Echo Cancellation

Francesco Nesta; Ted S. Wada; Biing-Hwang Juang

Semi-blind source separation (SBSS) is a special case of the well-known blind source separation (BSS) when some partial knowledge of the source signals is available to the system. In particular, a batch adaptation in the frequency domain based on independent component analysis (ICA) can be effectively used to jointly perform source separation and multichannel acoustic echo cancellation (MCAEC) through SBSS without double-talk detection. Many issues related to the implementation of an SBSS system are discussed in this paper. After a deep analysis of the structure of the SBSS adaptation, we propose a constrained batch-online implementation that stabilizes the convergence behavior even in the worst case scenario of a single far-end talker along with the non-uniqueness condition on the far-end mixing system. Specifically, a matrix constraint is proposed to reduce the effect of the non-uniqueness problem caused by highly correlated far-end reference signals during MCAEC. Experimental results show that high echo cancellation can be achieved just as the misalignment remains relatively low without any preprocessing procedure to decorrelate the far-end signals even for the single far-end talker case.

international conference on latent variable analysis and signal separation | 2012

Convolutive underdetermined source separation through weighted interleaved ICA and spatio-temporal source correlation

Francesco Nesta; Maurizio Omologo

This paper presents a novel method for underdetermined acoustic source separation of convolutive mixtures. Multiple complex-valued Independent Component Analysis adaptations jointly estimate the mixing matrix and the temporal activities of multiple sources in each frequency. A structure based on a recursive temporal weighting of the gradient forces each ICA adaptation to estimate mixing parameters related to sources having a disjoint temporal activity. The permutation problem is reduced by imposing a multiresolution spatio-temporal correlation of the narrow-band components. Finally, aligned mixing parameters are used to recover the sources through L0-norm minimization and a post-processing based on single-channel Wiener filtering. Promising results obtained over a public dataset show that the proposed method is an effective solution to the underdetermined source separation problem.

Computer Speech & Language | 2013

Blind source extraction for robust speech recognition in multisource noisy environments

Francesco Nesta; Marco Matassoni

This paper proposes and describes a complete system for Blind Source Extraction (BSE). The goal is to extract a target signal source in order to recognize spoken commands uttered in reverberant and noisy environments, and acquired by a microphone array. The architecture of the BSE system is based on multiple stages: (a) TDOA estimation, (b) mixing system identification for the target source, (c) on-line semi-blind source separation and (d) source extraction. All the stages are effectively combined, allowing the estimation of the target signal with limited distortion. While a generalization of the BSE framework is described, here the proposed system is evaluated on the data provided for the CHiME Pascal 2011 competition, i.e. binaural recordings made in a real-world domestic environment. The CHiME mixtures are processed with the BSE and the recovered target signal is fed to a recognizer, which uses noise robust features based on Gammatone Frequency Cepstral Coefficients. Moreover, acoustic model adaptation is applied to further reduce the mismatch between training and testing data and improve the overall performance. A detailed comparison between different models and algorithmic settings is reported, showing that the approach is promising and the resulting system gives a significant reduction of the error rate.

IEEE Transactions on Audio, Speech, and Language Processing | 2013

Semi-Blind Noise Extraction Using Partially Known Position of the Target Source

Zbynek Koldovsky; Jiri Malek; Petr Tichavsky; Francesco Nesta

An extracted noise signal provides important information for subsequent enhancement of a target signal. When the target's position is fixed, the noise extractor could be a target-cancellation filter derived in a noise-free situation. In this paper we consider a situation when such cancellation filters are prepared in advance for a set of several possible positions of the target. The set of filters is interpreted as prior information available for the noise extraction when the target's exact position is unknown. Our novel method looks for a linear combination of the prepared filters via Independent Component Analysis. The method yields a filter that has a better cancellation performance than the individual filters or filters based on a minimum variance principle. The method is tested in a highly noisy and reverberant real-world environment with a moving target source and interferers. Post-processing by a Wiener filter using the noise signal extracted by the method is able to improve the signal-to-noise ratio of the target by up to 8 dB.
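The Wiener post-processing step mentioned in this abstract can be illustrated with a minimal sketch. It assumes per-bin power estimates of the mixture and of the extracted noise are already available; the function name `wiener_gain`, the spectral floor of 0.05, and the subtraction-style gain rule are illustrative assumptions, not details taken from the paper.

```python
def wiener_gain(mix_power, noise_power, floor=0.05):
    """Per-bin Wiener-style gain from mixture and noise power estimates.

    Hypothetical helper: the gain 1 - Pn/Px approximates SNR / (1 + SNR)
    when the target power is estimated as Px - Pn. The floor limits
    over-suppression in bins where the noise estimate exceeds the mixture.
    """
    gains = []
    for px, pn in zip(mix_power, noise_power):
        if px <= 0.0:
            # No energy in the mixture bin: fall back to the floor.
            gains.append(floor)
            continue
        gains.append(max(1.0 - pn / px, floor))
    return gains


# Toy power values for three time-frequency bins (made up for illustration).
gains = wiener_gain([4.0, 1.0, 0.0], [1.0, 2.0, 1.0])
```

Each gain would then multiply the corresponding time-frequency bin of the mixture before resynthesis; a better noise estimate gives a more selective gain.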

international conference on acoustics, speech, and signal processing | 2010

Cooperative Wiener-ICA for source localization and Separation by distributed microphone arrays

Francesco Nesta; Maurizio Omologo

During the last decade, distributed microphone arrays have been proposed in order to increase accuracy and spatial coverage of speaker localization systems operating in large and reverberant rooms. In principle, the framework provided by a distributed microphone network can also be applied effectively when using Blind Source Separation (BSS). Separation is commonly performed by processing the signals sampled at closely spaced microphones in a single adaptation step, for example by means of Independent Component Analysis (ICA). When the microphone spacing or the distance between source and microphones increases, the separation performance degrades due to spatial aliasing effects and to a reduced spatial coherence at the microphones. In this paper we propose a new method, here referred to as Cooperative Wiener ICA (CW-ICA), which is able to apply BSS to signals acquired by a network of distributed microphone arrays. Different ICA adaptations are applied to the signals recorded by each array and are interconnected in order to constrain each adaptation to converge to a solution related to the same physical interpretation. A preliminary analysis on a network of two arrays shows that the proposed method can be applied successfully to source separation and localization tasks.

international conference on acoustics, speech, and signal processing | 2012

Enhanced multidimensional spatial functions for unambiguous localization of multiple sparse acoustic sources

Francesco Nesta; Maurizio Omologo

The Steered Response Power with PHAT transform (SRP-PHAT), or Global Coherence Field (GCF), has become a standard method for acoustic source localization, thanks to its simplicity, low computational cost and robustness against mid-high reverberation. However, originally formulated for the single source localization case, it does not apply satisfactorily to the multiple source case. In this paper, we analyze the structure of the spatial function and reshape it according to a generic multidimensional metric. We show that traditional functions are based on the L1 norm, which is prone to generate ambiguous locations with high likelihood (i.e. ghosts). A more generic multidimensional kernel based on higher norms and on a partitioned representation of the cross-power spectrum is introduced, which better exploits the source sparseness in the discrete time-frequency domain. Evaluation results over simulated data show that the new spatial functions considerably improve the detection of multiple competing sources in both spatial and multidimensional TDOA domains.
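For readers unfamiliar with the PHAT weighting behind SRP-PHAT, here is a minimal single-pair sketch of the standard GCC-PHAT delay estimate. It is not the enhanced spatial function proposed in the paper; the helper names `dft` and `gcc_phat_delay` and the toy signal are assumptions for illustration, and a naive O(N^2) DFT stands in for an FFT so the example has no dependencies.

```python
import cmath
import math
import random


def dft(x):
    """Naive O(N^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def gcc_phat_delay(x1, x2, max_lag):
    """Estimate the lag (in samples) by which x2 is delayed relative to x1."""
    n = len(x1) + len(x2)  # zero-pad so the correlation does not wrap around
    spec1 = dft(list(x1) + [0.0] * (n - len(x1)))
    spec2 = dft(list(x2) + [0.0] * (n - len(x2)))
    # PHAT weighting: keep only the phase of the cross-power spectrum.
    phat = []
    for a, b in zip(spec1, spec2):
        cross = b * a.conjugate()
        mag = abs(cross)
        phat.append(cross / mag if mag > 1e-12 else 0j)
    # Inverse DFT evaluated only at the candidate lags; pick the peak.
    best_lag, best_val = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        val = sum(
            p * cmath.exp(2j * math.pi * k * lag / n) for k, p in enumerate(phat)
        ).real / n
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag


# Toy check: white noise and a copy delayed by 3 samples.
rng = random.Random(0)
x1 = [rng.gauss(0.0, 1.0) for _ in range(32)]
x2 = [0.0] * 3 + x1
estimated = gcc_phat_delay(x1, x2, max_lag=8)
```

On this toy pair the peak falls at the 3-sample delay. SRP-PHAT accumulates such pairwise coherence values over all microphone pairs for each candidate location, which is where the choice of norm discussed in the abstract enters.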

international workshop on machine learning for signal processing | 2008

Multiple TDOA estimation by using a state coherence transform for solving the permutation problem in frequency-domain BSS

Francesco Nesta; Maurizio Omologo; Piergiorgio Svaizer

A novel method to solve the permutation problem for Blind Source Separation (BSS) is presented. According to the acoustic propagation model, in the frequency domain, each separation matrix can be represented with a set of states associated with each source. We formulate a novel transform of the states which is independent of the aliasing and of the permutations, since states belonging to all the sources are exploited at the same time. The estimated TDOAs are used to model the propagation of the acoustic wave and to cluster all the frequency components associated with the same source. Experimental results show that the novel approach can be applied to localize and separate sources in challenging situations: two sources have been separated by estimating long demixing filters (0.25-0.5 s) using widely spaced microphones (0.25 m) in a reverberant environment (T60 = 700 ms) and using very short signals (0.5-1 s).
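The final clustering step, in which estimated TDOAs are used to group frequency components by source, can be sketched as follows. This is a simplified illustration under a pure-delay (anechoic) two-microphone, two-source model and is not the State Coherence Transform itself; `steering_ratio`, `needs_swap`, the TDOA values and the frequencies are all hypothetical.

```python
import cmath
import math


def steering_ratio(freq_hz, tdoa_s):
    """Inter-microphone ratio predicted by a pure-delay model (assumption)."""
    return cmath.exp(-2j * math.pi * freq_hz * tdoa_s)


def needs_swap(freq_hz, ratios, tdoas):
    """Return True if the two components at this frequency should be swapped.

    The observed ratios are compared with the ratios predicted from the two
    TDOAs, and the assignment with the smaller total distance is chosen.
    """
    model = [steering_ratio(freq_hz, t) for t in tdoas]
    cost_keep = abs(ratios[0] - model[0]) + abs(ratios[1] - model[1])
    cost_swap = abs(ratios[0] - model[1]) + abs(ratios[1] - model[0])
    return cost_swap < cost_keep


# Toy data: two TDOAs and per-frequency ratios with known permutations.
tdoas = (0.0002, -0.0003)  # seconds, made up for illustration
freqs = [500.0, 1000.0, 1500.0, 3000.0]
true_swaps = [False, True, False, True]
observed = []
for f, swapped in zip(freqs, true_swaps):
    pair = [steering_ratio(f, tdoas[0]), steering_ratio(f, tdoas[1])]
    if swapped:
        pair.reverse()
    observed.append(pair)

decisions = [needs_swap(f, r, tdoas) for f, r in zip(freqs, observed)]
```

The decision is ambiguous at frequencies where both TDOAs predict the same phase (here, multiples of 2000 Hz), so the toy frequencies avoid them; in reverberant recordings the observed ratios would only approximately match the model.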
