Yuanhang Zheng
University of Erlangen-Nuremberg
Publications
Featured research published by Yuanhang Zheng.
IEEE Transactions on Audio, Speech, and Language Processing | 2011
Anthony Lombard; Yuanhang Zheng; Herbert Buchner; Walter Kellermann
In this paper, we show that minimization of the statistical dependence using broadband independent component analysis (ICA) can be successfully exploited for acoustic source localization. As the ICA signal model inherently accounts for the presence of several sources and multiple sound propagation paths, the ICA criterion offers a theoretically more rigorous framework than conventional techniques based on an idealized single-path and single-source signal model. This leads to algorithms which outperform other localization methods, especially in the presence of multiple simultaneously active sound sources and under adverse conditions, notably in reverberant environments. Three methods are investigated to extract the time difference of arrival (TDOA) information contained in the filters of a two-channel broadband ICA scheme. While the first, the blind system identification (BSI) approach, requires that the number of sources not exceed the number of sensors, the other two, the averaged directivity pattern (ADP) and composite mapped filter (CMF) approaches, can be used even when the number of sources exceeds the number of sensors. To allow fast tracking of moving sources, the ICA algorithm operates in block-wise batch mode, with a proportionate weighting of the natural gradient to speed up the convergence of the algorithm. The TDOA estimation accuracy of the proposed schemes is assessed in highly noisy and reverberant environments for two, three, and four stationary noise sources with speech-weighted spectral envelopes as well as for moving real speech sources.
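The paper's methods extract TDOAs from the ICA demixing filters rather than from the signals directly. As a minimal illustration of the TDOA quantity itself, the classical single-source, single-path baseline (the idealized model the paper improves on, not the paper's method) estimates the inter-microphone delay from the cross-correlation peak; the signal lengths and sampling rate below are illustrative assumptions:

```python
import numpy as np

def estimate_tdoa(x1, x2, fs):
    """Estimate the TDOA of x2 relative to x1 (in seconds) from the
    peak of the cross-correlation between two microphone signals."""
    corr = np.correlate(x2, x1, mode="full")
    lag = int(np.argmax(corr)) - (len(x1) - 1)
    return lag / fs

fs = 8000
rng = np.random.default_rng(0)
s = rng.standard_normal(fs)                  # 1 s of broadband source signal
true_lag = 5                                 # true TDOA of 5 samples
x1 = s
x2 = np.concatenate([np.zeros(true_lag), s[:-true_lag]])  # delayed copy
tdoa = estimate_tdoa(x1, x2, fs)             # recovers true_lag / fs
```

With multiple simultaneous sources or strong reverberation, this single peak becomes ambiguous, which is precisely what motivates the ICA-based framework above.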
IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing | 2009
Yuanhang Zheng; Klaus Reindl; Walter Kellermann
Blind Source Extraction (BSE) as desirable for acoustic cocktail party scenarios requires estimates for the target or interfering signals. Conventional single-channel approaches for obtaining the interference estimate rely on noise and interference estimates during absence of the target signal. For multichannel approaches using multiple microphone signals, a separation of simultaneously active target and interference signals becomes possible if the positions of the target and interfering sources are known. We propose a new method which exploits Directional BSS (Blind Source Separation with a geometric constraint) to estimate the interfering speech sources and diffuse background noise jointly and blindly. In this way, we can effectively deal with the underdetermined BSS scenario (more point sources than sensors) in reverberant environments and can even allow for additional babble noise in the background.
International Symposium on Communications, Control and Signal Processing | 2010
Klaus Reindl; Yuanhang Zheng; Walter Kellermann
The availability of wireless technologies leads from monaural or bilateral hearing aids to binaural processing strategies. In this paper, we investigate a class of blind source separation (BSS)-based speech enhancement algorithms for binaural hearing aids. The blind binaural processing strategies are analyzed and evaluated for different scenarios: determined scenarios, where the number of sources does not exceed the number of available sensors, and underdetermined scenarios, where there are more active source signals than microphones, which is typical for hearing aid applications. These blind algorithms are an attractive alternative to beamforming as no a-priori knowledge of the sensor positions is required. Moreover, BSS algorithms have the advantage that their optimization criteria are solely based on the fundamental assumption of mutual statistical independence of the different source signals.
EURASIP Journal on Advances in Signal Processing | 2014
Yuanhang Zheng; Klaus Reindl; Walter Kellermann
For speech enhancement or blind signal extraction (BSE), the performance depends decisively on estimates of the interference and noise characteristics. For multichannel approaches using multiple microphone signals, a BSE scheme combining a blocking matrix (BM) and spectral enhancement filters was proposed in numerous publications. For such schemes, the BM provides a noise estimate by suppressing the target signal only. The estimated noise reference is then used to design spectral enhancement filters for the purpose of noise reduction. For designing the BM, ‘Directional Blind Source Separation (BSS)’ was already proposed earlier. This method combines a generic BSS algorithm with a geometric constraint derived from prior information on the target source position to obtain an estimate for all interfering point sources and diffuse background noise. In this paper, we provide a theoretical analysis to show that Directional BSS converges to a relative transfer function (RTF)-based BM. The behavior of this informed signal separation scheme is analyzed and the blocking performance of Directional BSS under various acoustical conditions is evaluated. The robustness of Directional BSS regarding the localization error for the target source position is verified by experiments. Finally, a BSE scheme combining Directional BSS and Wiener-type spectral enhancement filters is described and evaluated.
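The final enhancement stage can be illustrated with a generic Wiener-type spectral gain computed per frequency bin from a noise-reference power spectral density, as a blocking matrix would provide. This is a minimal sketch of the general gain rule, not the paper's exact filter design; the gain floor and PSD values are illustrative assumptions:

```python
import numpy as np

def wiener_gain(noisy_psd, noise_psd, gain_floor=0.1):
    """Wiener-type spectral gain per frequency bin, derived from a
    noise-reference PSD estimate (e.g. from a blocking matrix).
    A gain floor limits musical-noise artifacts."""
    snr_term = 1.0 - noise_psd / np.maximum(noisy_psd, 1e-12)
    return np.clip(snr_term, gain_floor, 1.0)

noisy_psd = np.array([2.0, 1.0, 0.5])   # |Y(f)|^2 of the noisy mixture
noise_psd = np.array([0.5, 0.5, 0.5])   # noise-reference |N(f)|^2
gains = wiener_gain(noisy_psd, noise_psd)
# bins dominated by noise are attenuated down to the gain floor
```

Applying these gains to the STFT of the noisy signal and resynthesizing yields the enhanced output; the quality of the noise reference, the point of the paper's analysis, directly determines how accurate the gains are.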
2011 Joint Workshop on Hands-free Speech Communication and Microphone Arrays | 2011
Yuanhang Zheng; Anthony Lombard; Walter Kellermann
In rapidly time-varying acoustic scenarios, Blind Source Separation (BSS) often suffers from slow convergence and poor separation performance. In this paper, we propose an improved scheme composed of Directional BSS and a source localizer for robustly separating and quickly tracking sources in a rapidly time-varying scene. Directional BSS is defined as a generic BSS algorithm combined with geometric constraints which are roughly equivalent to a set of delay&subtract beamformers. This scheme combines benefits from BSS and null beamforming and achieves high performance for separating sources of known locations. For estimating the Time-Difference-Of-Arrival (TDOA) of each source, another BSS system running in parallel to the main Directional BSS is exploited to serve as a source localizer. Experimental results illustrate the effectiveness of the proposed concept.
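A single delay&subtract beamformer of the kind the geometric constraints resemble can be sketched for two microphones. The integer-sample delay below is a simplifying assumption (practical constraints use fractional-delay filters), and the free-field, single-path setup is illustrative only:

```python
import numpy as np

def delay_and_subtract(x1, x2, delay):
    """Two-channel delay & subtract beamformer: delays channel 1 by
    `delay` samples and subtracts channel 2, placing a spatial null
    on any source whose inter-channel delay equals `delay`."""
    x1_delayed = np.concatenate([np.zeros(delay), x1[:-delay]])
    return x1_delayed - x2

rng = np.random.default_rng(1)
s = rng.standard_normal(4000)               # source signal
d = 3                                       # source's inter-mic delay
x1 = s
x2 = np.concatenate([np.zeros(d), s[:-d]])  # same source, delayed at mic 2
out = delay_and_subtract(x1, x2, d)         # source is nulled exactly
```

Steering such a null towards a known source position is what lets the constrained BSS suppress that source while the blind criterion handles the remaining mixture.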
Asilomar Conference on Signals, Systems and Computers | 2010
Klaus Reindl; Yuanhang Zheng; Anthony Lombard; Andreas Schwarz; Walter Kellermann
In this contribution, an acoustic front-end for distant-talking interfaces as developed within the European Union-funded project DICIT (Distant-talking Interfaces for Control of Interactive TV) is presented. It comprises state-of-the-art multichannel acoustic echo cancellation and blind source separation-based signal extraction, and requires only two microphone signals. The proposed scheme is analyzed and evaluated for different realistic scenarios when a speech recognizer is used as back-end. The results show that the system significantly outperforms simpler alternatives, such as a two-channel Delay & Sum beamformer, for speech signal extraction.
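The two-channel Delay & Sum baseline used for comparison can be sketched as follows, under an anechoic single-path assumption with an integer-sample inter-channel delay (both simplifications for illustration):

```python
import numpy as np

def delay_and_sum(x1, x2, delay):
    """Two-channel delay & sum beamformer: time-aligns channel 1 to
    channel 2 for a target with inter-channel delay `delay`, then
    averages, which preserves the target and halves the power of
    spatially uncorrelated sensor noise."""
    x1_aligned = np.concatenate([np.zeros(delay), x1[:-delay]])
    return 0.5 * (x1_aligned + x2)

rng = np.random.default_rng(2)
s = rng.standard_normal(4000)                   # target signal
d = 4                                           # target's inter-mic delay
n1 = 0.5 * rng.standard_normal(4000)            # uncorrelated sensor noise
n2 = 0.5 * rng.standard_normal(4000)
x1 = s + n1
x2 = np.concatenate([np.zeros(d), s[:-d]]) + n2
y = delay_and_sum(x1, x2, d)
```

With only two microphones, this fixed spatial averaging offers limited interference suppression, which is why the BSS-based extraction in the front-end above outperforms it.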
International Conference on Acoustics, Speech, and Signal Processing | 2011
Anthony Lombard; Yuanhang Zheng; Walter Kellermann
In this paper, minimization of the statistical dependence is exploited for acoustic source localization purposes. Originally developed for the separation of signal mixtures, Independent Component Analysis (ICA) is shown to be successfully applicable to localizing multiple simultaneously active sound sources, possibly with fewer sensors than sources. First, the recently proposed Averaged Directivity Pattern (ADP) and State Coherence Transform (SCT) methods are reviewed. Similarities and differences between both approaches are underlined and analyzed, leading to a new method merging elements from both concepts, which we call the Modified ADP (MADP). Since the investigated methods do not suffer from the permutation ambiguity, they can be applied in combination with any narrowband or broadband ICA algorithm, without the need to solve the still challenging permutation issue. Experimental results are presented for speech sources in a reverberant environment.
International Conference on Signal Processing | 2012
Klaus Reindl; Yuanhang Zheng; Stefan Meier; Andreas Schwarz; Walter Kellermann
In this contribution, a two-channel acoustic front-end for robust automatic speech recognition (ASR) in adverse acoustic environments is analyzed. The source signal extraction scheme combines a blocking matrix based on semi-blind source separation, which provides a continuously updated reference of all undesired components separated from the desired signal and its reflections, with a single-channel Wiener postfilter. The postfilter is directly derived from the obtained noise and interference reference signal and hence generalizes well-known postfilter realizations. The proposed front-end and its integration into an ASR system are analyzed and evaluated with respect to keyword accuracy under reverberant conditions with unpredictable and nonstationary interferences, and for different target source distances. Comparisons with a simplified front-end based on free-field assumptions, with an ideal front-end where knowledge of the true undesired components is assumed, and with the competing approach of solely using multistyle training demonstrate the importance of adequate signal preprocessing for robust distant speech recognition.
Computer Speech & Language | 2013
Klaus Reindl; Yuanhang Zheng; Andreas Schwarz; Stefan Meier; Roland Maas; Armin Sehr; Walter Kellermann
Archive | 2011
Roland Maas; Andreas Schwarz; Yuanhang Zheng; Klaus Reindl; Stefan Meier; Armin Sehr; Walter Kellermann