Sharon Gannot
Bar-Ilan University
Publication
Featured research published by Sharon Gannot.
IEEE Transactions on Signal Processing | 2001
Sharon Gannot; David Burshtein; Ehud Weinstein
We consider a sensor array located in an enclosure, where arbitrary transfer functions (TFs) relate the source signal and the sensors. The array is used for enhancing a signal contaminated by interference. Constrained minimum power adaptive beamforming, suggested by Frost (1972), and in particular its generalized sidelobe canceler (GSC) version, developed by Griffiths and Jim (1982), are the most widely used beamforming techniques. These methods rely on the assumption that the received signals are simple delayed versions of the source signal. The good interference suppression attained under this assumption is severely impaired in complicated acoustic environments, where arbitrary TFs may be encountered. In this paper, we consider the arbitrary TF case. We propose a GSC solution that is adapted to the general TF case. We derive a suboptimal algorithm that can be implemented by estimating the TF ratios instead of the TFs themselves. The TF ratios are estimated by exploiting the nonstationarity characteristics of the desired signal. The algorithm is applied to the problem of speech enhancement in a reverberant room. The discussion is supported by an experimental study using speech and noise signals recorded in an actual room acoustics environment.
IEEE Transactions on Speech and Audio Processing | 1998
Sharon Gannot; David Burshtein; Ehud Weinstein
Speech quality and intelligibility might significantly deteriorate in the presence of background noise, especially when the speech signal is subject to subsequent processing. In particular, speech coders and automatic speech recognition (ASR) systems that were designed or trained to act on clean speech signals might be rendered useless in the presence of background noise. Speech enhancement algorithms have therefore attracted a great deal of interest. In this paper, we present a class of Kalman filter-based algorithms with some extensions, modifications, and improvements of previous work. The first algorithm employs the estimate-maximize (EM) method to iteratively estimate the spectral parameters of the speech and noise signals. The enhanced speech signal is obtained as a byproduct of the parameter estimation algorithm. The second algorithm is a sequential, computationally efficient, gradient descent algorithm. We discuss various topics concerning the practical implementation of these algorithms. An extensive experimental study using real speech and noise signals is provided to compare these algorithms with alternative speech enhancement algorithms, and to compare the performance of the iterative and sequential algorithms.
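As a rough illustration of the Kalman filtering step that such algorithms build on, the sketch below filters a first-order autoregressive signal observed in white noise. It is not the paper's method: the model order, the parameter values `a`, `q`, `r`, and the function name `kalman_ar1` are assumptions made for this example, and the parameters are taken as known, whereas the abstract describes estimating them with EM or a gradient descent scheme.

```python
import numpy as np

def kalman_ar1(y, a, q, r):
    """Scalar Kalman filter for s[t] = a*s[t-1] + u[t], y[t] = s[t] + v[t].

    a: AR(1) coefficient, q: driving-noise variance, r: observation-noise
    variance. All three are assumed known here (an assumption of this sketch).
    """
    s_hat = np.zeros(len(y))
    s = 0.0
    p = q / (1.0 - a * a)            # start from the stationary signal variance
    for t in range(len(y)):
        s_pred = a * s               # predict the state
        p_pred = a * a * p + q       # predict the error variance
        gain = p_pred / (p_pred + r) # Kalman gain
        s = s_pred + gain * (y[t] - s_pred)
        p = (1.0 - gain) * p_pred
        s_hat[t] = s
    return s_hat

# Synthetic stand-in for a speech-like signal; values are illustrative only.
rng = np.random.default_rng(0)
a, q, r, n = 0.95, 1.0, 4.0, 2000
clean = np.zeros(n)
for t in range(1, n):
    clean[t] = a * clean[t - 1] + rng.normal(scale=np.sqrt(q))
noisy = clean + rng.normal(scale=np.sqrt(r), size=n)

enhanced = kalman_ar1(noisy, a, q, r)
mse_noisy = np.mean((noisy - clean) ** 2)
mse_enhanced = np.mean((enhanced - clean) ** 2)
print(mse_noisy, mse_enhanced)
```

Real speech would need a higher-order, time-varying model estimated frame by frame, which is where the parameter estimation described in the abstract comes in.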
IEEE Transactions on Audio, Speech, and Language Processing | 2009
Shmulik Markovich; Sharon Gannot; Israel Cohen
In many practical environments we wish to extract several desired speech signals, which are contaminated by nonstationary and stationary interfering signals. The desired signals may also be subject to distortion imposed by the acoustic room impulse responses (RIRs). In this paper, a linearly constrained minimum variance (LCMV) beamformer is designed for extracting the desired signals from multimicrophone measurements. The beamformer satisfies two sets of linear constraints. One set is dedicated to maintaining the desired signals, while the other set is chosen to mitigate both the stationary and nonstationary interferences. Unlike classical beamformers, which approximate the RIRs as delay-only filters, we take into account the entire RIR [or its respective acoustic transfer function (ATF)]. The LCMV beamformer is then reformulated in a generalized sidelobe canceler (GSC) structure, consisting of a fixed beamformer (FBF), blocking matrix (BM), and adaptive noise canceler (ANC). It is shown that for a spatially white noise field, the beamformer reduces to an FBF, satisfying the constraint sets, without power minimization. It is shown that the application of the adaptive ANC contributes to interference reduction, but only when the constraint sets are not completely satisfied. We show that relative transfer functions (RTFs), which relate the desired speech sources and the microphones, and a basis for the interference subspace suffice for constructing the beamformer. The RTFs are estimated by applying the generalized eigenvalue decomposition (GEVD) procedure to the power spectral density (PSD) matrices of the received signals and the stationary noise. A basis for the interference subspace is estimated by collecting eigenvectors, calculated in segments where nonstationary interfering sources are active and the desired sources are inactive. The rank of the basis is then reduced by the application of the orthogonal triangular decomposition (QRD). This procedure relaxes the common requirement for nonoverlapping activity periods of the interference sources. A comprehensive experimental study in both simulated and real environments demonstrates the performance of the proposed beamformer.
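The closed-form LCMV solution for a single frequency bin can be sketched as below, using the standard textbook expression w = R^{-1} C (C^H R^{-1} C)^{-1} g. This is only an illustration of the constraint structure, not the paper's RTF-based estimation procedure: the constraint matrix `C`, the response vector `g`, and the random covariance `R` are hypothetical stand-ins.

```python
import numpy as np

def lcmv_weights(R, C, g):
    """Closed-form LCMV weights for one frequency bin.

    R: (M, M) Hermitian positive-definite PSD matrix of the received signals
    C: (M, K) constraint matrix, one column per constrained source
    g: (K,)   desired responses, so that C^H w = g
    """
    Rinv_C = np.linalg.solve(R, C)                       # R^{-1} C
    return Rinv_C @ np.linalg.solve(C.conj().T @ Rinv_C, g)

rng = np.random.default_rng(0)
M = 6
# Hypothetical transfer-function vectors: column 0 is a desired source,
# column 1 is an interferer.
C = rng.standard_normal((M, 2)) + 1j * rng.standard_normal((M, 2))
g = np.array([1.0, 0.0])                                 # keep desired, null interferer

A = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
R = A @ A.conj().T + np.eye(M)                           # random Hermitian PD matrix

w = lcmv_weights(R, C, g)
print(np.round(C.conj().T @ w, 6))                       # should reproduce g

# For spatially white noise (R = I) the weights reduce to a fixed beamformer
# that depends only on the constraints.
w_white = lcmv_weights(np.eye(M), C, g)
w_fixed = C @ np.linalg.solve(C.conj().T @ C, g)
```

The last three lines mirror the observation in the abstract that, in a spatially white noise field, no power minimization remains and the beamformer is determined by the constraint sets alone.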
EURASIP Journal on Advances in Signal Processing | 2003
Sharon Gannot; Marc Moonen
A novel approach for multimicrophone speech dereverberation is presented. The method is based on the construction of the null subspace of the data matrix in the presence of colored noise, using the generalized singular-value decomposition (GSVD) technique, or the generalized eigenvalue decomposition (GEVD) of the respective correlation matrices. The special Sylvester structure of the filtering matrix, related to this subspace, is exploited for deriving a total least squares (TLS) estimate for the acoustical transfer functions (ATFs). Other less robust but computationally more efficient methods are derived based on the same structure and on the QR decomposition (QRD). A preliminary study of the incorporation of the subspace method into a subband framework proves to be efficient, although some problems remain open. Speech reconstruction is achieved by virtue of the matched filter beamformer (MFBF). An experimental study supports the potential of the proposed methods.
IEEE Transactions on Speech and Audio Processing | 2004
Sharon Gannot; Israel Cohen
In speech enhancement applications, microphone array postfiltering allows additional reduction of noise components at a beamformer output. Among microphone array structures, the recently proposed general transfer function generalized sidelobe canceller (TF-GSC) has shown impressive noise reduction abilities in a directional noise field, while still maintaining low speech distortion. However, in a diffuse noise field less significant noise reduction is obtainable. The performance is degraded even further when the noise signal is nonstationary. In this contribution we propose three postfiltering methods for improving the performance of microphone arrays. Two of them are based on single-channel speech enhancers and make use of recently proposed algorithms concatenated to the beamformer output. The third is a multichannel speech enhancer which exploits noise-only components constructed within the TF-GSC structure. This work concentrates on the assessment of the proposed postfiltering structures. An extensive experimental study, which consists of both objective and subjective evaluation in various noise fields, demonstrates the advantage of the multichannel postfiltering compared to the single-channel techniques.
IEEE Signal Processing Letters | 2009
Emanuël A. P. Habets; Sharon Gannot; Israel Cohen
In speech communication systems the received microphone signals are degraded by room reverberation and ambient noise that decrease the fidelity and intelligibility of the desired speaker. Reverberant speech can be separated into two components, viz. early speech and late reverberant speech. Recently, various algorithms have been developed to suppress late reverberant speech. One of the main challenges is to develop an estimator for the so-called late reverberant spectral variance (LRSV) which is required by most of these algorithms. In this letter a statistical reverberation model is proposed that takes the energy contribution of the direct-path into account. This model is then used to derive a more general LRSV estimator, which in a particular case reduces to an existing LRSV estimator. Experimental results show that the developed estimator is advantageous in case the source-microphone distance is smaller than the critical distance.
IEEE Transactions on Speech and Audio Processing | 2002
David Burshtein; Sharon Gannot
We present a spectral domain, speech enhancement algorithm. The new algorithm is based on a mixture model for the short time spectrum of the clean speech signal, and on a maximum assumption in the production of the noisy speech spectrum. In the past this model was used in the context of noise robust speech recognition. In this paper we show that this model is also effective for improving the quality of speech signals corrupted by additive noise. The computational requirements of the algorithm can be significantly reduced, essentially without paying performance penalties, by incorporating a dual codebook scheme with tied variances. Experiments, using recorded speech signals and actual noise sources, show that in spite of its low computational requirements, the algorithm shows improved performance compared to alternative speech enhancement algorithms.
IEEE Transactions on Audio, Speech, and Language Processing | 2010
Emanuel A. P. Habets; Jacob Benesty; Israel Cohen; Sharon Gannot; Jacek Dmochowski
The minimum variance distortionless response (MVDR) beamformer, also known as Capon's beamformer, is widely studied in the area of speech enhancement. The MVDR beamformer can be used for both speech dereverberation and noise reduction. This paper provides new insights into the MVDR beamformer. Specifically, the local and global behavior of the MVDR beamformer is analyzed and novel forms of the MVDR filter are derived and discussed. In earlier works it was observed that there is a tradeoff between the amount of speech dereverberation and noise reduction when the MVDR beamformer is used. Here, the tradeoff between speech dereverberation and noise reduction is analyzed thoroughly. The local and global behavior, as well as the tradeoff, is analyzed for different noise fields, such as a mixture of coherent and non-coherent noise fields, entirely non-coherent noise fields, and diffuse noise fields. It is shown that maximum noise reduction is achieved when the MVDR beamformer is used for noise reduction only. The amount of noise reduction that is sacrificed when complete dereverberation is required depends on the direct-to-reverberation ratio of the acoustic impulse response between the source and the reference microphone. The performance evaluation supports the theoretical analysis and demonstrates the tradeoff between speech dereverberation and noise reduction. When desiring both speech dereverberation and noise reduction, the results also demonstrate that the amount of noise reduction that is sacrificed decreases when the number of microphones increases.
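For reference, the standard MVDR weights for one frequency bin are w = R^{-1} d / (d^H R^{-1} d), which can be sketched as follows. The example covers only the simplest case, noise reduction in a spatially white noise field with a hypothetical delay-only steering vector `d`; it does not reproduce the dereverberation tradeoff analyzed in the paper, and the function names are chosen for illustration.

```python
import numpy as np

def mvdr_weights(R, d):
    """MVDR weights w = R^{-1} d / (d^H R^{-1} d) for one frequency bin."""
    Rinv_d = np.linalg.solve(R, d)
    return Rinv_d / np.vdot(d, Rinv_d)   # np.vdot conjugates its first argument

def output_noise_power(w, R):
    """Noise power w^H R w at the beamformer output."""
    return np.vdot(w, R @ w).real

powers = []
for M in (2, 4, 8):
    # Hypothetical far-field steering vector with unit-modulus entries.
    d = np.exp(-1j * np.pi * 0.3 * np.arange(M))
    R = np.eye(M)                        # spatially white noise, unit variance
    w = mvdr_weights(R, d)
    powers.append(output_noise_power(w, R))
    print(M, abs(np.vdot(w, d)), powers[-1])
```

In this white-noise setting the distortionless constraint w^H d = 1 holds for every array size, while the residual noise power falls as 1/M, a simple instance of noise reduction improving with the number of microphones.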
Archive | 2008
Sharon Gannot; Israel Cohen
In this chapter, we explore many of the basic concepts of array processing with an emphasis on adaptive beamforming for speech enhancement applications. We begin in Sect. 47.1 by formulating the problem of a microphone array in a noisy and reverberant environment. In Sect. 47.2, we derive the frequency-domain linearly constrained minimum-variance (LCMV) beamformer, and its generalized sidelobe canceller (GSC) variant. The GSC components are explored in Sect. 47.3, and several commonly used special cases of these blocks are presented. As the GSC structure necessitates an estimation of the speech-related acoustical transfer functions (ATFs), several alternative system identification methods are addressed in Sect. 47.4. Beamformers often suffer from sensitivity to signal mismatch. We analyze this phenomenon in Sect. 47.5 and explore several cures to this problem. Although the GSC beamformer yields a significant improvement in speech quality, when the noise field is spatially incoherent or diffuse, the noise reduction is insufficient and additional postfiltering is normally required. In Sect. 47.6, we present multi-microphone postfilters, based on either minimum mean-squared error (MMSE) or log-spectral amplitude estimate criteria. An interesting relation between the GSC and the Wiener filter is derived in this section as well. In Sect. 47.7, we analyze the performance of the transfer-function GSC (TF-GSC), and in Sect. 47.8 demonstrate the advantage of multichannel postfiltering over single-channel postfiltering in nonstationary noise conditions.
Journal of the Acoustical Society of America | 2007
Emanuel A. P. Habets; Sharon Gannot
Researchers in the signal processing community often require sensor signals that result from a spherically or cylindrically isotropic noise field for simulation purposes. Although it has been shown that these signals can be generated using a number of uncorrelated noise sources that are uniformly spaced on a sphere or cylinder, this method is seldom used in practice. In this paper algorithms that generate sensor signals of an arbitrary one- and three-dimensional array that result from a spherically or cylindrically isotropic noise field are developed. Furthermore, the influence of the number of noise sources on the accuracy of the generated sensor signals is investigated.
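The effect of the number of noise sources on accuracy can be illustrated with a small sketch that compares the well-known spatial coherence of a spherically isotropic field, sin(2πfd/c)/(2πfd/c), with the coherence produced by a finite number of uncorrelated plane-wave sources. This is not the paper's algorithm: the two-sensor geometry along one axis, the equal-area placement of the sources, the speed of sound of 343 m/s, and the function names are all assumptions of this example.

```python
import numpy as np

def coherence_theory(f, d, c=343.0):
    """Spatial coherence of a spherically isotropic field for sensor spacing d."""
    # np.sinc(x) is sin(pi*x)/(pi*x), so this equals sin(2*pi*f*d/c)/(2*pi*f*d/c).
    return np.sinc(2.0 * f * d / c)

def coherence_from_sources(f, d, n_sources, c=343.0):
    """Coherence between two sensors on the z axis from n_sources plane waves.

    The sources are assumed to be uncorrelated, equal in power, and placed so
    that cos(theta) is uniformly spaced in (-1, 1), i.e. equal-area bands.
    """
    z = (2.0 * np.arange(n_sources) + 1.0) / n_sources - 1.0   # cos(theta)
    delays = d * z / c                                         # inter-sensor delays
    return np.mean(np.exp(-2j * np.pi * f * delays)).real

f, d = 1000.0, 0.1            # illustrative frequency (Hz) and spacing (m)
exact = coherence_theory(f, d)
err_few = abs(coherence_from_sources(f, d, 4) - exact)
err_many = abs(coherence_from_sources(f, d, 64) - exact)
print(exact, err_few, err_many)
```

With these illustrative values the approximation error shrinks as sources are added, which is the kind of accuracy question the abstract refers to.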