Publication


Featured research published by Mehrez Souden.


IEEE Transactions on Signal Processing | 2010

A Study of the LCMV and MVDR Noise Reduction Filters

Mehrez Souden; Jacob Benesty; Sofiène Affes

In real-world environments, the signals captured by a set of microphones in a speech communication system are mixtures of the desired signal, interference, and ambient noise. A promising solution for proper speech acquisition (with reduced noise and interference) in this context consists in using the linearly constrained minimum variance (LCMV) beamformer to reject the interference, reduce the overall mixture energy, and preserve the target signal. The minimum variance distortionless response (MVDR) beamformer is also commonly known to reduce the interference-plus-noise energy without distorting the desired signal. In either case, it is of paramount importance to accurately quantify the achieved noise and interference reduction. Indeed, it is quite reasonable to ask, for instance, about the price that has to be paid in order to achieve total removal of the interference without distorting the target signal when using the LCMV. Besides, it is fundamental to understand the effect of the MVDR on both noise and interference. In this correspondence, we investigate the performance of the MVDR and LCMV beamformers when the interference and ambient noise coexist with the target source. We demonstrate a new relationship between both filters in which the MVDR is decomposed into the LCMV and a matched filter (MVDR solution in the absence of interference). Both components are properly weighted to achieve maximum interference-plus-noise reduction. We investigate the performance of the MVDR, LCMV, and matched filters and elaborate new closed-form expressions for their output signal-to-interference ratio (SIR) and output signal-to-noise ratio (SNR). We theoretically demonstrate the tradeoff that has to be made between noise reduction and interference rejection. In fact, the total removal of the interference may severely amplify the residual ambient noise. Conversely, totally focussing on noise reduction leads to an increased level of residual interference. The proposed study is finally supported by several numerical examples.
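
The tradeoff described in this abstract can be illustrated with a small numerical sketch. The code below is not taken from the paper: the array geometry, source directions, and power levels are assumptions, and `steering`, `distortionless`, and `residuals` are hypothetical helper names. It builds the MVDR, LCMV, and matched filters for one narrowband frequency and reports the residual interference and noise power of each.

```python
import numpy as np

def steering(n_mics, angle_deg, spacing=0.5):
    """Far-field steering vector of a uniform linear array.
    `spacing` is the element spacing in wavelengths (assumed value)."""
    k = np.arange(n_mics)
    return np.exp(-2j * np.pi * spacing * k * np.cos(np.deg2rad(angle_deg)))

n_mics = 4
d = steering(n_mics, 90.0)          # target direction (assumed)
g = steering(n_mics, 70.0)          # interferer direction (assumed)
p_int, p_noise = 1.0, 0.1           # interference and white-noise powers (assumed)

phi_noise = p_noise * np.eye(n_mics)
phi_in = p_int * np.outer(g, g.conj()) + phi_noise   # interference plus noise

def distortionless(phi, d):
    """Minimum-variance filter h with h^H d = 1 for covariance phi."""
    a = np.linalg.solve(phi, d)
    return a / np.vdot(d, a)

h_mvdr = distortionless(phi_in, d)       # MVDR on interference plus noise
h_mf = distortionless(phi_noise, d)      # matched filter: MVDR without interference

# LCMV: h^H d = 1 and h^H g = 0, minimum variance otherwise.
C = np.column_stack([d, g])
B = np.linalg.solve(phi_in, C)
h_lcmv = B @ np.linalg.solve(C.conj().T @ B, np.array([1.0, 0.0]))

def residuals(h):
    """Residual interference and noise powers at the filter output."""
    interference = p_int * abs(np.vdot(h, g)) ** 2
    noise = p_noise * np.real(np.vdot(h, h))
    return interference, noise

for name, h in [("MVDR", h_mvdr), ("LCMV", h_lcmv), ("matched", h_mf)]:
    interference, noise = residuals(h)
    print(f"{name:8s} interference={interference:.4f} noise={noise:.4f}")
```

In this setup the LCMV removes the interference entirely but leaves more residual noise than the matched filter, while the MVDR achieves the lowest total interference-plus-noise power of the three.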


IEEE Transactions on Audio, Speech, and Language Processing | 2011

An Integrated Solution for Online Multichannel Noise Tracking and Reduction

Mehrez Souden; Jingdong Chen; Jacob Benesty; Sofiène Affes

Noise statistics estimation is a paramount issue in the design of reliable noise-reduction algorithms. Although significant efforts have been devoted to this problem in the literature, most developed methods so far have focused on the single-channel case. When multiple microphones are used, it is important that the data from all the sensors are optimally combined to achieve judicious updates of the noise statistics and the noise-reduction filter. This contribution is devoted to the development of a practical approach to multichannel noise tracking and reduction. We combine the multichannel speech presence probability (MC-SPP) that we proposed in an earlier contribution with an alternative formulation of the minima-controlled recursive averaging (MCRA) technique that we generalize from the single-channel to the multichannel case. To demonstrate the effectiveness of the proposed MC-SPP and multichannel noise estimator, we integrate them into three variants of the multichannel noise reduction Wiener filter. Experimental results show the advantages of the proposed solution.
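
A minimal sketch of the general idea follows: a recursive noise covariance update whose smoothing factor is driven by a speech presence probability. This is written in the spirit of MCRA-style recursive averaging and is not the paper's exact formulation. The function name `update_noise_cov`, the default `alpha`, and the specific blending rule are assumptions.

```python
import numpy as np

def update_noise_cov(phi_vv, y, spp, alpha=0.92):
    """One recursive update of the multichannel noise covariance estimate.

    phi_vv : current noise covariance estimate (n_mics x n_mics)
    y      : current multichannel observation in one time-frequency bin
    spp    : speech presence probability for that bin, between 0 and 1
    alpha  : base smoothing factor (assumed value)
    """
    # The more likely speech is present, the closer the effective smoothing
    # factor is to 1, so the estimate is frozen during speech.
    alpha_eff = alpha + (1.0 - alpha) * spp
    return alpha_eff * phi_vv + (1.0 - alpha_eff) * np.outer(y, y.conj())
```

With `spp = 1` the estimate is left unchanged, and with `spp = 0` the update reduces to ordinary exponential averaging of the observed outer products.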


IEEE Transactions on Audio, Speech, and Language Processing | 2010

Gaussian Model-Based Multichannel Speech Presence Probability

Mehrez Souden; Jingdong Chen; Jacob Benesty; Sofiène Affes

The knowledge of the target speech presence probability in a mixture of signals captured by a speech communication system is of paramount importance in several applications including reliable noise reduction algorithms. In this correspondence, we establish a new expression for speech presence probability when an array of microphones with an arbitrary geometry is used. Our study is based on the assumption of the Gaussian statistical model for all signals and involves the noise and noisy data statistics only. In comparison with the single-channel case, the new proposed multichannel approach can significantly increase the detection accuracy. In particular, when the additive noise is spatially coherent, perfect speech presence detection is theoretically possible, while when the noise is spatially white, a coherent summation of speech components is performed to allow for enhanced speech presence probability estimation.
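
The Gaussian assumption can be sketched as a direct Bayes computation. The code below does not reproduce the paper's closed-form expression. It simply evaluates the posterior probability of speech presence from two zero-mean circular complex Gaussian likelihoods, one for noise only and one for speech plus noise. The function names and the a priori speech absence probability `q` are assumptions.

```python
import numpy as np

def log_cgauss(y, phi):
    """Log-density of a zero-mean circular complex Gaussian with covariance phi."""
    n = y.shape[0]
    _, logdet = np.linalg.slogdet(phi)
    quad = np.real(np.vdot(y, np.linalg.solve(phi, y)))
    return -n * np.log(np.pi) - logdet - quad

def multichannel_spp(y, phi_vv, phi_xx, q=0.5):
    """Posterior probability that speech is present in observation y.

    phi_vv : noise covariance, phi_xx : speech covariance,
    q      : a priori probability of speech absence (assumed value).
    """
    log_absent = np.log(q) + log_cgauss(y, phi_vv)
    log_present = np.log1p(-q) + log_cgauss(y, phi_vv + phi_xx)
    return 1.0 / (1.0 + np.exp(log_absent - log_present))
```

Only the noise covariance and the noisy-data covariance (`phi_vv + phi_xx`) enter the computation, which matches the abstract's remark that the approach involves the noise and noisy data statistics only.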


IEEE Transactions on Signal Processing | 2008

A Two-Stage Approach to Estimate the Angles of Arrival and the Angular Spreads of Locally Scattered Sources

Mehrez Souden; Sofiène Affes; Jacob Benesty

We propose a new two-stage approach to estimate the nominal angles of arrival (AoAs) and the angular spreads (ASs) of multiple locally scattered sources using a uniform linear array (ULA) of sensors. In contrast to earlier works, we consider both long- and short-term channel variations, typically encountered in wireless links. In the first stage, we exploit source independence to blindly estimate the channel over several data blocks regularly spaced by intervals larger than the coherence time but each short enough in length to make time variations negligible within the block duration. We thereby decouple the multisource channel parameter estimation problem at hand into parallel and independent single-source channel parameter estimation subproblems. In the second stage, for each spatially scattered source, we process the corresponding sequence of quasi-independent channel realization estimates as a new single-scattered-source observation over which we apply Taylor series expansions to transform the estimation of the nominal AoA and the AS of the corresponding scattered source into a simple localization of two closely spaced, equi-powered, and uncorrelated rays (i.e., point sources). To localize both rays, we propose new accurate and computationally simple closed-form expressions for the mean value of the spatial harmonics and their separation by means of covariance fitting. An asymptotic performance analysis is also provided to prove the efficiency of the proposed estimators. Then, the AS and the nominal AoA of every source are directly deduced. The whole proposed framework takes advantage of the capabilities of the preprocessing channel identification step (to reduce the noise effect and decouple the estimation of the channel parameters of every source from the others) and the new simple and accurate closed-form estimators to accurately retrieve the channel parameters even in the most adverse conditions, mainly low signal-to-noise ratio (SNR), few sensors, no prior knowledge of the angular distribution, and closely spaced sources, as supported by simulations.


IEEE Transactions on Signal Processing | 2009

Robust Doppler Spread Estimation in the Presence of a Residual Carrier Frequency Offset

Mehrez Souden; Sofiène Affes; Jacob Benesty; Rim Bahroun

In high data-rate transmission systems, accurate Doppler spread estimation is critical not only for mobile velocity estimation but also for optimal adaptive processing. It is known that the residual carrier frequency offset (CFO), which is inherent to the asynchrony between the communicating ends in a wireless link, has a detrimental effect on Doppler spread estimation. In this correspondence, we propose a new simple and accurate approach that copes with this issue by explicitly taking the CFO into account when estimating the Doppler spread. This new approach stems from the fact that the cross-correlation of the channel is a weighted summation of monochromatic plane waves (or an inverse Fourier transform of its power spectral density). It turns out that these plane waves are locally (as compared to the sampling rate) distributed around a main frequency which is nothing but the CFO. Using this property, we base our analysis on Taylor series expansions in addition to an observation temporal aperture to develop a two-ray spectrum approximate model for the Doppler spread estimation. We find that the Doppler spread is half of the frequency spacing between both rays, which are located symmetrically around the CFO. Finally, we deduce new closed-form estimators for the Doppler spread and also for the CFO. These estimators are accurate and practical in environments with isotropic scattering where the channel power spectral density (PSD) is symmetric. Simulations are provided to illustrate the advantages of the proposed method and its robustness to the CFO.
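
The two-ray idea can be sketched numerically. The estimators below are not the closed-form expressions of the paper. They are a simplified illustration that assumes a Jakes-type correlation (a Bessel function of the first kind, order zero) rotated by the CFO, and approximates it by two equal rays placed symmetrically around the CFO. All numerical values and the names `bessel_j0` and `two_ray_estimates` are assumptions.

```python
import numpy as np

def bessel_j0(x, n_points=2000):
    """J0(x) from its integral form (1/pi) * int_0^pi cos(x sin t) dt (midpoint rule)."""
    t = (np.arange(n_points) + 0.5) * np.pi / n_points
    return float(np.mean(np.cos(x * np.sin(t))))

def two_ray_estimates(r0, r1, lag):
    """Estimate CFO and Doppler spread from the channel correlation at lags 0 and `lag`.

    Assumed two-ray model: r(lag) = r0 * exp(j*2*pi*cfo*lag) * cos(2*pi*spread*lag),
    that is, two equal rays at cfo - spread and cfo + spread.
    """
    cfo = np.angle(r1) / (2 * np.pi * lag)
    ratio = min(abs(r1) / abs(r0), 1.0)
    spread = np.arccos(ratio) / (2 * np.pi * lag)
    return cfo, spread

# Hypothetical scenario: maximum Doppler 100 Hz, CFO 30 Hz, correlation lag 0.1 ms.
f_max, cfo_true, lag = 100.0, 30.0, 1e-4
r0 = 1.0
r1 = bessel_j0(2 * np.pi * f_max * lag) * np.exp(2j * np.pi * cfo_true * lag)

cfo_hat, spread_hat = two_ray_estimates(r0, r1, lag)
# Under the assumed Jakes model the spread equals f_max / sqrt(2).
f_max_hat = np.sqrt(2.0) * spread_hat
```

The phase of the correlation carries the CFO and its magnitude carries the spread, so the two quantities separate cleanly in this noise-free example. With noisy correlation estimates a single short lag would be far less reliable.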


IEEE Transactions on Audio, Speech, and Language Processing | 2010

Broadband Source Localization From an Eigenanalysis Perspective

Mehrez Souden; Jacob Benesty; Sofiène Affes

Broadband source localization has several applications ranging from automatic video camera steering to target signal tracking and enhancement through beamforming. Consequently, there has been a considerable amount of effort to develop reliable methods for accurate localization over the last few decades. Essentially, the localization process consists in finding the candidate source location that maximizes the synchrony between the properly time-shifted microphone outputs. In addition to using well-known cross-correlation-based criteria such as the steered response power (SRP), minimum variance (MV), and multichannel cross-correlation (MCCC), this synchrony can also be measured using the averaged magnitude difference function (AMDF) and the averaged magnitude sum function (AMSF), whose calculations involve low computational cost. In earlier related works, the latter techniques have been used for time delay estimation (TDE) of a target source observed by only one pair of microphones. Their generalization to the multiple microphone case and application to source localization have not been studied yet. In this paper, we consider both categories, i.e., cross-correlation and AMDF (with AMSF)-based approaches, using an arbitrary number of microphones, and analyze their performance. Specifically, we first provide a unifying study of the most popular cross-correlation-based techniques, such as the SRP, MV, and MCCC. We use the eigenanalysis of the parameterized spatial correlation matrix (PSCM) to classify these methods and gain some insight into their performance. We demonstrate, for instance, that the MV and SRP consist in searching the major eigenvalue of the PSCM, while the MCCC, essentially, combines its minor eigenvalues when scanning for the source location. Inspired by this analysis, we show, in the second part of this work, the efficiency of the AMDF and AMSF in localizing an acoustic source using multiple microphones. Indeed, we propose two new parameterized matrices named the parameterized averaged magnitude difference matrix (PAMDM) and the parameterized averaged magnitude sum matrix (PAMSM). The eigenanalysis of these matrices also reveals new criteria for acoustic source localization. Simulation results are provided to illustrate the effectiveness of all the investigated and proposed methods.
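
The scanning procedure can be sketched for a simple case. The code below is an illustration under strong assumptions, not the paper's method: a white source, integer sample delays along a linear array, and circular shifts standing in for propagation delays. For each candidate delay it time-aligns the microphone signals, forms a PSCM, and records both the largest eigenvalue and the SRP (the sum of all PSCM entries). The function name `pscm` and all parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_mics, n_samples, true_delay = 4, 4096, 3   # hypothetical setup, delay in samples

source = rng.standard_normal(n_samples)
# Microphone m receives the source delayed by m * true_delay samples, plus noise.
# np.roll gives a circular shift, a simplification of a real propagation delay.
mics = np.stack([
    np.roll(source, m * true_delay) + 0.3 * rng.standard_normal(n_samples)
    for m in range(n_mics)
])

def pscm(mics, delay):
    """Parameterized spatial correlation matrix for one candidate delay."""
    aligned = np.stack([np.roll(x, -m * delay) for m, x in enumerate(mics)])
    return aligned @ aligned.T / aligned.shape[1]

candidates = np.arange(-6, 7)
# eigvalsh returns eigenvalues in ascending order, so [-1] is the major one.
major = np.array([np.linalg.eigvalsh(pscm(mics, int(tau)))[-1] for tau in candidates])
srp = np.array([pscm(mics, int(tau)).sum() for tau in candidates])

delay_from_eig = int(candidates[np.argmax(major)])
delay_from_srp = int(candidates[np.argmax(srp)])
```

When the candidate delay matches the true one, the aligned signals are strongly correlated and the PSCM is close to rank one plus a noise floor, so its major eigenvalue peaks. This is the behaviour that the eigenanalysis in the abstract exploits.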


Computer Speech & Language | 2013

Speech recognition in living rooms: Integrated speech enhancement and recognition system based on spatial, spectral and temporal modeling of sounds

Marc Delcroix; Keisuke Kinoshita; Tomohiro Nakatani; Shoko Araki; Atsunori Ogawa; Takaaki Hori; Shinji Watanabe; Masakiyo Fujimoto; Takuya Yoshioka; Takanobu Oba; Yotaro Kubo; Mehrez Souden; Seong-Jun Hahm; Atsushi Nakamura

Research on noise-robust speech recognition has mainly focused on dealing with relatively stationary noise that may differ from the noise conditions in most living environments. In this paper, we introduce a recognition system that can recognize speech in the presence of multiple rapidly time-varying noise sources as found in a typical family living room. To deal with such severe noise conditions, our recognition system exploits all available information about speech and noise, that is, spatial (directional), spectral, and temporal information. This is realized with a model-based speech enhancement pre-processor, which consists of two complementary elements: a multi-channel speech-noise separation method that exploits spatial and spectral information, followed by a single-channel enhancement algorithm that uses the long-term temporal characteristics of speech obtained from clean speech examples. Moreover, to compensate for any mismatch that may remain between the enhanced speech and the acoustic model, our system employs an adaptation technique that combines conventional maximum likelihood linear regression with the dynamic adaptive compensation of the variance of the Gaussians of the acoustic model. Our proposed system approaches human performance levels by greatly improving the audible quality of speech and substantially improving the keyword recognition accuracy.


IEEE Transactions on Audio, Speech, and Language Processing | 2012

A Perspective on Differential Microphone Arrays in the Context of Noise Reduction

Jacob Benesty; Mehrez Souden; Yiteng Huang

In this correspondence, we study the performance of differential microphone arrays (DMAs) in terms of noise reduction, speech distortion, and signal-to-noise ratio (SNR) gain. We also investigate their beampatterns and array gains. We start by establishing the expressions of these performance measures involving general derivatives of the channel transfer functions. Afterwards, we specify our results in the case of anechoic near-field and far-field propagation models.
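
As a small illustration of the quantities discussed here, the sketch below evaluates a textbook first-order differential array (two microphones, one delayed and subtracted from the other) rather than the general derivation of the paper. The spacing, frequency, and the helper name `first_order_dma_response` are assumptions. It computes the beampattern at three angles and the white noise gain in the look direction.

```python
import numpy as np

c = 343.0          # speed of sound in m/s
spacing = 0.01     # microphone spacing in metres (assumed)
freq = 500.0       # analysis frequency in Hz (assumed)

def first_order_dma_response(theta, freq, spacing, delay):
    """Response of y = x1 - delayed(x2) to a far-field plane wave from angle theta.

    theta = 0 is the endfire direction in which the wave reaches microphone 1 first.
    """
    omega = 2 * np.pi * freq
    return 1.0 - np.exp(-1j * omega * (delay + spacing * np.cos(theta) / c))

delay = spacing / c   # this choice places the null at 180 degrees (cardioid)
theta = np.deg2rad(np.array([0.0, 90.0, 180.0]))
response = first_order_dma_response(theta, freq, spacing, delay)
pattern = np.abs(response) / np.abs(response[0])   # normalized beampattern

# White noise gain in the look direction: the two weights have unit magnitude,
# so uncorrelated sensor noise is doubled in power at the output.
white_noise_gain = np.abs(response[0]) ** 2 / 2.0
```

For a small spacing the normalized pattern is close to the cardioid (1 + cos θ)/2, while the white noise gain falls well below one at low frequencies. This is one concrete face of the noise reduction and SNR gain questions the abstract raises.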


IEEE Signal Processing Letters | 2010

On the Global Output SNR of the Parameterized Frequency-Domain Multichannel Noise Reduction Wiener Filter

Mehrez Souden; Jacob Benesty; Sofiène Affes

The parameterized multichannel Wiener filter (PMWF) is known to allow for a flexible tuning of speech distortion and noise reduction. In addition, the output signal-to-noise ratio (SNR) is a natural metric that shows the true effect of such a filter on both residual noise and filtered speech. Earlier contributions have only shown that the output SNR of the Wiener filter is larger than the input SNR (as a proof of its effectiveness). However, the effect of the tuning parameter (denoted as β following the notation of [1]) on this metric is not yet understood. In this paper, we prove that the global (fullband) output SNR of the PMWF is an increasing function of β but remains below an asymptotic value. As a byproduct, a very simplified proof of the global output SNR improvement is provided.
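
The stated behaviour can be checked numerically on synthetic data. The sketch below assumes a commonly used form of the PMWF, h = Φ_v⁻¹ Φ_x u / (β + tr(Φ_v⁻¹ Φ_x)), with u selecting the reference microphone, a rank-one speech covariance in each frequency bin, and randomly generated covariances. None of these values come from the paper, and the helper names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_mics, n_bins = 3, 16

def random_bin():
    """One frequency bin: rank-one speech covariance and full-rank noise covariance."""
    a = rng.standard_normal((n_mics, n_mics)) + 1j * rng.standard_normal((n_mics, n_mics))
    phi_v = a @ a.conj().T / n_mics + 0.1 * np.eye(n_mics)
    tail = rng.standard_normal(n_mics - 1) + 1j * rng.standard_normal(n_mics - 1)
    d = np.concatenate(([1.0], tail))          # unit response at the reference microphone
    phi_x = rng.uniform(0.1, 10.0) * np.outer(d, d.conj())
    return phi_x, phi_v

bins = [random_bin() for _ in range(n_bins)]

def pmwf(phi_x, phi_v, beta):
    """Assumed PMWF form: phi_v^{-1} phi_x u / (beta + tr(phi_v^{-1} phi_x))."""
    m = np.linalg.solve(phi_v, phi_x)
    return m[:, 0] / (beta + np.real(np.trace(m)))

def global_snr(filters):
    """Fullband output SNR: total filtered speech power over total residual noise power."""
    speech = sum(np.real(np.vdot(h, phi_x @ h)) for h, (phi_x, _) in zip(filters, bins))
    noise = sum(np.real(np.vdot(h, phi_v @ h)) for h, (_, phi_v) in zip(filters, bins))
    return speech / noise

betas = [0.0, 0.5, 1.0, 2.0, 5.0, 20.0]
snrs = [global_snr([pmwf(px, pv, b) for px, pv in bins]) for b in betas]

# As beta grows, every bin's filter tends to phi_v^{-1} phi_x u scaled by the same 1/beta,
# so the limiting global SNR can be evaluated with the unscaled filters.
snr_limit = global_snr([np.linalg.solve(pv, px)[:, 0] for px, pv in bins])
```

In the rank-one case the narrowband output SNR does not depend on β, so the growth of `snrs` comes only from how β redistributes weight across frequency bins.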


IEEE Signal Processing Letters | 2012

Noise Power Spectral Density Tracking: A Maximum Likelihood Perspective

Mehrez Souden; Marc Delcroix; Keisuke Kinoshita; Takuya Yoshioka; Tomohiro Nakatani

We propose a new approach for online noise power spectral density (PSD) tracking. In this approach, the prior and posterior probabilities of speech absence and also noise statistics are analytically retrieved from a maximum-likelihood-based criterion at every time-frequency slot. The recursive update rules of these three terms are performed in a unified manner and without relying on the conventional tracking of speech PSD minima. A single parameter (a forgetting factor) is needed in this process. Comparisons with state-of-the-art methods demonstrate the effectiveness of our proposal.

Collaboration

Top Co-Authors

Sofiène Affes, Institut national de la recherche scientifique
Tomohiro Nakatani, Nippon Telegraph and Telephone
Keisuke Kinoshita, Nippon Telegraph and Telephone
Marc Delcroix, Nippon Telegraph and Telephone
Shoko Araki, Nippon Telegraph and Telephone
Jingdong Chen, Northwestern Polytechnical University
Hiroshi Sawada, Nippon Telegraph and Telephone