Ina Kodrasi
University of Oldenburg
Publications
Featured research published by Ina Kodrasi.
IEEE Transactions on Audio, Speech, and Language Processing | 2013
Ina Kodrasi; Stefan Goetze; Simon Doclo
Acoustic multichannel equalization techniques such as the multiple-input/output inverse theorem (MINT), which aim to equalize the room impulse responses (RIRs) between the source and the microphone array, are known to be highly sensitive to RIR estimation errors. To increase robustness, it has been proposed to incorporate regularization in order to decrease the energy of the equalization filters. In addition, more robust partial multichannel equalization techniques such as relaxed multichannel least-squares (RMCLS) and channel shortening (CS) have recently been proposed. In this paper, we propose a partial multichannel equalization technique based on MINT (P-MINT) which aims to shorten the RIR. Furthermore, we investigate the effectiveness of incorporating regularization to further increase the robustness of P-MINT and the aforementioned partial multichannel equalization techniques, i.e., RMCLS and CS. In addition, we introduce an automatic non-intrusive procedure for determining the regularization parameter based on the L-curve. Simulation results using measured RIRs show that incorporating regularization in P-MINT yields a significant performance improvement in the presence of RIR estimation errors, whereas a smaller performance improvement is observed when incorporating regularization in RMCLS and CS. Furthermore, it is shown that the intrusively regularized P-MINT technique outperforms all other investigated intrusively regularized multichannel equalization techniques in terms of perceptual speech quality (PESQ). Finally, it is shown that the automatic non-intrusive regularization parameter in regularized P-MINT leads to a performance very similar to that of the intrusively determined optimal regularization parameter, making regularized P-MINT a robust, perceptually advantageous, and practically applicable multichannel equalization technique for speech dereverberation.
International Conference on Acoustics, Speech, and Signal Processing | 2012
Ina Kodrasi; Simon Doclo
This paper presents a novel approach for partial multichannel equalization using the multiple-input/output inverse theorem with the first part of one of the estimated channels as the target response (P-MINT). In order to further increase the robustness against channel estimation errors, two extensions are proposed, i.e. the incorporation of a regularization parameter in the inverse filter design and a truncated singular value decomposition approach. Experimental results for speech dereverberation show that the regularized P-MINT method outperforms state-of-the-art techniques such as channel shortening and the relaxed multichannel least-squares method in terms of robustness to channel estimation errors.
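The filter design described above can be illustrated with a minimal sketch. It assumes Tikhonov regularization of a least-squares problem, w = (HᵀH + δI)⁻¹Hᵀd, where H stacks the per-channel convolution matrices and the target d is the early part of the first RIR. The function names, the toy RIRs, and the value of δ are hypothetical and are not taken from the paper.

```python
def conv_matrix(h, n_cols):
    """(len(h) + n_cols - 1) x n_cols convolution (Toeplitz) matrix of h."""
    rows = len(h) + n_cols - 1
    return [[h[r - c] if 0 <= r - c < len(h) else 0.0 for c in range(n_cols)]
            for r in range(rows)]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def regularized_pmint(rirs, filt_len, target_len, delta):
    """Return (filters, equalized response, target) for equal-length RIRs."""
    # Stack the per-channel convolution matrices side by side: H = [H_1 ... H_M].
    blocks = [conv_matrix(h, filt_len) for h in rirs]
    H = [sum((blk[r] for blk in blocks), []) for r in range(len(blocks[0]))]
    # Partial-equalization target: early part of the first RIR, zeros afterwards.
    d = [rirs[0][i] if i < target_len else 0.0 for i in range(len(H))]
    n = len(H[0])
    # Normal equations with Tikhonov regularization: (H^T H + delta I) w = H^T d.
    A = [[sum(row[i] * row[j] for row in H) + (delta if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    b = [sum(H[r][i] * d[r] for r in range(len(H))) for i in range(n)]
    w = solve(A, b)
    eir = [sum(row[i] * w[i] for i in range(n)) for row in H]
    return w, eir, d


def energy(w):
    return sum(x * x for x in w)


# Two toy 3-tap RIRs without common zeros (hypothetical values).
h1 = [1.0, 0.5, 0.2]
h2 = [1.0, -0.4, 0.1]
w0, eir0, d = regularized_pmint([h1, h2], filt_len=2, target_len=2, delta=0.0)
w1, eir1, _ = regularized_pmint([h1, h2], filt_len=2, target_len=2, delta=1e-2)
print("filter energy without / with regularization:", energy(w0), energy(w1))
```

With exactly known RIRs and δ = 0, the equalized response matches the target. Increasing δ lowers the filter energy at the cost of a small equalization error, which is the trade-off the abstract refers to.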
International Conference on Acoustics, Speech, and Signal Processing | 2011
Ina Kodrasi; Thomas Rohdenburg; Simon Doclo
The performance of a fixed beamformer highly depends on the position of the microphones in the array. In this paper, different heuristic optimisation approaches for arbitrary planar arrays and an exhaustive search approach for structured array geometries are presented to optimise the microphone positions for a superdirective beamformer, aiming at maximizing the mean directivity index for several steering angles of interest. Through the derivation of an upper bound on the achievable performance, it is shown that the proposed approaches generate configurations with a near-optimal performance. In addition, the theoretical results are validated using real measurements, demonstrating the practical usability of the proposed methods.
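To make the optimization criterion concrete, the following sketch evaluates the directivity index of a two-microphone superdirective beamformer in a spherically isotropic noise field and picks the spacing with the highest mean directivity index over a few steering angles. The two-microphone restriction, the candidate spacings, the frequency, and all function names are assumptions made for illustration. The paper considers larger arrays and different search strategies.

```python
import math


def directivity_index_db(spacing, freq, steer_deg, c=343.0):
    """DI in dB of a two-microphone superdirective beamformer.

    steer_deg is measured from the array axis (0 degrees = endfire).
    """
    kd = 2.0 * math.pi * freq * spacing / c
    s = math.sin(kd) / kd  # coherence of a spherically isotropic field
    phase = kd * math.cos(math.radians(steer_deg))
    # For w = Gamma^{-1} d / (d^H Gamma^{-1} d) the directivity factor equals
    # d^H Gamma^{-1} d, which has this closed form for Gamma = [[1, s], [s, 1]].
    df = (2.0 - 2.0 * s * math.cos(phase)) / (1.0 - s * s)
    return 10.0 * math.log10(df)


def mean_di(spacing, freq, angles):
    """Mean directivity index over the steering angles of interest."""
    return sum(directivity_index_db(spacing, freq, a) for a in angles) / len(angles)


# Exhaustive search over a few candidate spacings (hypothetical values, in metres).
candidates = [0.02, 0.05, 0.10]
angles = [0.0, 45.0, 90.0]
best = max(candidates, key=lambda sp: mean_di(sp, 500.0, angles))
print("best spacing:", best)
```

For a closely spaced pair, the endfire directivity factor approaches 4 (about 6 dB) and the broadside value approaches 1 (0 dB), which gives a quick sanity check of the formula.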
International Workshop on Acoustic Signal Enhancement | 2014
Stefan Goetze; Anna Warzybok; Ina Kodrasi; Jan Ole Jungmann; Benjamin Cauchi; Jan Rennies; Emanuel A. P. Habets; Alfred Mertins; Timo Gerkmann; Simon Doclo; Birger Kollmeier
This paper reports on the evaluation of several objective quality measures for predicting the quality of dereverberated speech signals. The correlations between subjective quality assessments for single-channel dereverberation techniques and objective speech quality as well as speech intelligibility measures are analyzed and discussed. Six different single-channel dereverberation algorithms were included in the evaluation to account for different types of distortions. The subjective quality was assessed along the four attributes reverberant, colored, distorted, and overall quality, following the recommendations of ITU-T P.835. The objective measures included system-based, i.e., channel-based, as well as signal-based measures.
International Workshop on Acoustic Signal Enhancement | 2014
Anna Warzybok; Ina Kodrasi; Jan Ole Jungmann; Emanuel A. P. Habets; Timo Gerkmann; Alfred Mertins; Simon Doclo; Birger Kollmeier; Stefan Goetze
In this contribution, six different single-channel dereverberation algorithms are evaluated subjectively in terms of speech intelligibility and speech quality. In order to study the influence of the dereverberation algorithms on speech intelligibility, speech reception thresholds in noise were measured for different reverberation times. The quality ratings were obtained following the ITU-T P.835 recommendations (with slight changes for adaptation to the problem of dereverberation) and included assessment of the attributes reverberant, colored, distorted, and overall quality. Most of the algorithms improved speech intelligibility for short as well as long reverberation times compared to the reverberant condition. The best performance in terms of speech intelligibility and quality was observed for the regularized spectral inverse approach with pre-echo removal. The overall quality of the processed signals was highly correlated with the attributes reverberant and/or distorted. To generalize the present outcomes, further studies are needed to account for the influence of the estimation errors.
International Conference on Acoustics, Speech, and Signal Processing | 2014
Ina Kodrasi; Timo Gerkmann; Simon Doclo
The objective of single-channel inverse filtering is to design an inverse filter that achieves dereverberation while being robust to an inaccurate room impulse response (RIR) measurement or estimate. Since a stable and causal inverse filter typically does not exist, approximate time-domain inverse filtering techniques such as single-channel least-squares (SCLS) have been proposed. However, besides being computationally expensive and often infeasible, SCLS generally leads to distortions in the output signal in the presence of RIR inaccuracies. In this paper, a theoretical analysis is first provided, showing that the direct inversion of the acoustic transfer function in the frequency domain generally yields instability and acausality issues. In order to resolve these issues, a novel frequency-domain inverse filtering technique is proposed that incorporates regularization and uses a single-channel speech enhancement scheme. Experimental results demonstrate that the proposed technique yields a higher dereverberation performance and has a significantly lower computational complexity compared to the SCLS technique.
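The role of regularization in a frequency-domain inverse filter can be sketched as follows. The per-bin form G = H* / (|H|² + δ) is a common regularized inverse and is an assumption here, not necessarily the exact filter of the paper. The sketch covers only the regularization step and omits the single-channel speech enhancement stage. The toy RIR, the DFT length, and δ are hypothetical.

```python
import cmath
import math


def dft(x, n_fft):
    """Naive n_fft-point DFT of a zero-padded real sequence."""
    return [sum(v * cmath.exp(-2j * math.pi * k * n / n_fft) for n, v in enumerate(x))
            for k in range(n_fft)]


def regularized_inverse(H, delta):
    """Per-bin regularized inverse G = conj(H) / (|H|^2 + delta)."""
    return [h.conjugate() / (abs(h) ** 2 + delta) for h in H]


# Toy RIR with a zero on the unit circle at z = -1, where 1/H is unbounded.
h = [1.0, 1.0]
H = dft(h, 8)
delta = 1e-2
G = regularized_inverse(H, delta)

max_gain = max(abs(g) for g in G)
equalized = [abs(g * hk) for g, hk in zip(G, H)]
print("largest inverse-filter gain:", max_gain)
```

Because |H| / (|H|² + δ) never exceeds 1 / (2√δ), the gain of the regularized inverse stays bounded even at spectral zeros, while the equalized magnitude |GH| stays below one.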
IEEE Transactions on Audio, Speech, and Language Processing | 2016
Ina Kodrasi; Simon Doclo
Regularized acoustic multi-channel equalization techniques, such as regularized partial multi-channel equalization based on the multiple-input/output inverse theorem (RPMINT), are able to achieve a high dereverberation performance in the presence of room impulse response perturbations but may lead to amplification of the additive noise. In this paper, two time-domain techniques aiming at joint dereverberation and noise reduction based on acoustic multi-channel equalization are proposed. The first technique, namely RPMINT for joint dereverberation and noise reduction (RPM-DNR), extends RPMINT by explicitly taking the noise statistics into account. In addition to the regularization parameter used in RPMINT, the RPM-DNR technique introduces an additional weighting parameter, enabling a trade-off between dereverberation and noise reduction. The second technique, namely multi-channel Wiener filter for joint dereverberation and noise reduction (MWF-DNR), takes both the speech and the noise statistics into account and uses the RPMINT filter to compute a dereverberated reference signal for the multi-channel Wiener filter. The MWF-DNR technique also introduces an additional weighting parameter, which now provides a trade-off between speech distortion and noise reduction. To automatically select the regularization and weighting parameters, a novel procedure based on the L-hypersurface is proposed for the RPM-DNR technique, whereas two decoupled optimization procedures based on the L-curve are used for the MWF-DNR technique. Extensive simulations using instrumental measures demonstrate that the RPM-DNR technique maintains the dereverberation performance of the RPMINT technique while improving its noise reduction performance. Furthermore, it is shown that the MWF-DNR technique yields a significantly better noise reduction performance than the RPM-DNR technique at the expense of a worse dereverberation performance.
International Conference on Acoustics, Speech, and Signal Processing | 2017
Ina Kodrasi; Simon Doclo
Multi-channel methods for estimating the late reverberant power spectral density (PSD) rely on an estimate of the direction of arrival (DOA) of the speech source or of the relative early transfer functions (RETFs) of the target signal from a reference microphone to all microphones. The DOA and the RETFs may be difficult to estimate accurately, particularly in highly reverberant and noisy scenarios. In this paper, we propose a novel multi-channel method to estimate the late reverberant PSD which does not require estimates of the DOA or RETFs. The late reverberation is modeled as an isotropic sound field and the late reverberant PSD is estimated based on the eigenvalues of the prewhitened received signal PSD matrix. Experimental results demonstrate the advantages of using the proposed estimator in a multi-channel Wiener filter for speech dereverberation, outperforming a recently proposed maximum likelihood estimator both when the DOA is perfectly estimated and in the presence of DOA estimation errors.
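The eigenvalue idea can be illustrated with a two-microphone toy model. It assumes a noise-free signal model Φ_y = φ_s d dᴴ + φ_r Γ with a real-valued isotropic coherence matrix Γ = [[1, γ], [γ, 1]]. The eigenvalues of the prewhitened matrix are then the generalized eigenvalues of (Φ_y, Γ), and the smallest one equals φ_r. The function name and all numerical values are hypothetical. Which eigenvalues the paper actually uses is not stated in the abstract.

```python
import cmath
import math


def late_reverb_psd_two_mics(phi_y, gamma):
    """Smallest generalized eigenvalue of (phi_y, Gamma), Gamma = [[1, g], [g, 1]]."""
    a, b, c = phi_y[0][0].real, phi_y[0][1], phi_y[1][1].real
    # det(phi_y - lam * Gamma) = q2 * lam^2 - q1 * lam + q0 = 0
    q2 = 1.0 - gamma ** 2
    q1 = a + c - 2.0 * gamma * b.real
    q0 = a * c - abs(b) ** 2
    disc = max(q1 * q1 - 4.0 * q2 * q0, 0.0)
    return (q1 - math.sqrt(disc)) / (2.0 * q2)


# Synthetic PSD matrix built from known quantities (hypothetical values).
phi_s, phi_r, gamma = 2.0, 0.5, 0.6
d = [1.0, cmath.exp(-1j * 0.8)]  # transfer function vector, unknown to the estimator
phi_y = [[phi_s * d[i] * d[j].conjugate() + phi_r * (1.0 if i == j else gamma)
          for j in range(2)] for i in range(2)]

estimate = late_reverb_psd_two_mics(phi_y, gamma)
print("estimated late reverberant PSD:", estimate)
```

Note that the estimator never uses d, which mirrors the claim that no DOA or RETF estimate is required. In practice Φ_y is itself estimated and contains additive noise, so the recovery is no longer exact.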
International Conference on Acoustics, Speech, and Signal Processing | 2015
Ina Kodrasi; Daniel Marquardt; Simon Doclo
The objective of the speech distortion weighted multichannel Wiener filter (MWF) is to reduce background noise while controlling speech distortion. This can be achieved by means of a trade-off parameter; hence, selecting an optimal trade-off parameter is of crucial importance. Aiming at incorporating knowledge about the resulting speech distortion and noise power, in this paper we propose to compute the trade-off parameter as the point of maximum curvature of the parametric plot of noise power versus speech distortion. To determine a narrowband trade-off parameter, an analytical expression is derived for computing the point of maximum curvature, whereas to determine a broadband parameter an optimization routine is used. The speech distortion and the noise power terms can also be weighted in advance, e.g., based on perceptually motivated criteria. Experimental results show that using the proposed method instead of the MWF improves the intelligibility-weighted SNR without significantly increasing the speech distortion.
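The maximum-curvature criterion can be sketched numerically in a single-channel toy case. It assumes the parametric Wiener gain G(μ) = ξ / (ξ + μ) with SNR ξ = φ_s / φ_n, speech distortion (1 − G)² φ_s, and residual noise power G² φ_n. The curvature is approximated with central differences on a uniform grid of μ rather than with the analytical expression derived in the paper. All names and values are hypothetical.

```python
def curvature(xs, ys, step):
    """Curvature of a uniformly sampled parametric curve via central differences."""
    kappa = []
    for i in range(1, len(xs) - 1):
        dx = (xs[i + 1] - xs[i - 1]) / (2 * step)
        dy = (ys[i + 1] - ys[i - 1]) / (2 * step)
        ddx = (xs[i + 1] - 2 * xs[i] + xs[i - 1]) / step ** 2
        ddy = (ys[i + 1] - 2 * ys[i] + ys[i - 1]) / step ** 2
        kappa.append(abs(dx * ddy - dy * ddx) / (dx * dx + dy * dy) ** 1.5)
    return kappa


def tradeoff_by_max_curvature(phi_s, phi_n, mu_max=5.0, step=0.005):
    """Trade-off parameter at the point of maximum curvature (toy model)."""
    mus = [step * (i + 1) for i in range(int(mu_max / step))]
    snr = phi_s / phi_n
    gains = [snr / (snr + mu) for mu in mus]
    distortion = [phi_s * (1 - g) ** 2 for g in gains]  # speech distortion
    noise = [phi_n * g ** 2 for g in gains]             # residual noise power
    kappa = curvature(distortion, noise, step)
    best = max(range(len(kappa)), key=kappa.__getitem__)
    return mus[best + 1]  # kappa[i] belongs to grid point i + 1


mu_star = tradeoff_by_max_curvature(phi_s=1.0, phi_n=1.0)
print("trade-off parameter at maximum curvature:", mu_star)
```

For equal speech and noise PSDs the curve is a parabola segment whose vertex lies at μ = 1, so the grid search should return a value close to 1 in this symmetric case.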
International Conference on Acoustics, Speech, and Signal Processing | 2013
Ina Kodrasi; Stefan Goetze; Simon Doclo
The objective of acoustic multichannel equalization is to design a reshaping filter that reduces reverberation, improves the perceptual speech quality, and is robust to errors in the estimated room impulse responses (RIRs). Although the channel shortening (CS) technique has been shown to be effective in achieving dereverberation, it may fail to preserve the natural shape of an RIR, leading to speech quality degradation. Furthermore, CS yields multiple reshaping filters that satisfy its optimization criterion but result in different perceptual speech quality. In this paper, we propose a robust perceptually constrained channel shortening technique (PeCCS) that resolves the selection ambiguity of CS and leads to joint dereverberation and speech quality preservation. Simulation results for erroneously estimated RIRs show that PeCCS preserves the perceptual speech quality and results in a higher reverberant tail suppression than other state-of-the-art techniques, such as CS and the regularized partial multichannel equalization technique based on the multiple-input/output inverse theorem (P-MINT).