Publication


Featured research published by Sarmad Malik.


IEEE Transactions on Audio, Speech, and Language Processing | 2012

State-Space Frequency-Domain Adaptive Filtering for Nonlinear Acoustic Echo Cancellation

Sarmad Malik; Gerald Enzner

In this paper, we address adaptive acoustic echo cancellation in the presence of an unknown memoryless nonlinearity preceding the echo path. We approach the problem by considering a basis-generic expansion of the memoryless nonlinearity. By absorbing the coefficients of the nonlinear expansion into the unknown echo path, the cascade observation model is transformed into an equivalent multichannel structure, which we further augment with a multichannel first-order Markov model. For the resulting multichannel state-space model, we then derive a recursive Bayesian estimator that takes the form of an adaptive Kalman algorithm in the discrete Fourier transform (DFT) domain. We show that such a recursive estimator can be realized via a stable and structurally efficient multichannel state-space frequency-domain adaptive filter. We demonstrate that our algorithm, which stems from a contained framework, provides effective nonlinear echo cancellation in the presence of continuous double-talk, varying degree of nonlinear distortion, and changes in the echo path.


International Conference on Acoustics, Speech, and Signal Processing | 2011

Fourier expansion of Hammerstein models for nonlinear acoustic system identification

Sarmad Malik; Gerald Enzner

We consider the task of acoustic system identification, where the input signal undergoes a memoryless nonlinear transformation before convolving with an unknown linear system. We focus on the possibility of modeling the nonlinearity with different basis functions, namely the established power series and the proposed Fourier expansion. In this work the unknown coefficients of generic basis functions are merged with the unknown linear system to obtain an equivalent multichannel structure. We use a multichannel DFT-domain algorithm for learning the underlying coefficients of both types of basis functions. We show that the Fourier modeling achieves faster convergence and better learning of the underlying nonlinearity than the polynomial basis.
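
The merging step described in this abstract can be illustrated with a small numerical check. The sketch below is not the authors' algorithm: it only verifies, for a power-series basis, that a memoryless nonlinearity followed by a linear system gives the same output as a bank of parallel channels, where channel p is driven by the p-th basis function of the input and filtered by the linear system scaled by the p-th coefficient. The function names, signal length, coefficients, and impulse response are all hypothetical values chosen for illustration.

```python
import random


def convolve(a, b):
    """Full linear convolution of two finite sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def hammerstein_cascade(x, coeffs, h):
    """Memoryless power-series nonlinearity followed by the linear system h."""
    z = [sum(c * xi ** (p + 1) for p, c in enumerate(coeffs)) for xi in x]
    return convolve(z, h)


def hammerstein_multichannel(x, coeffs, h):
    """Equivalent form: channel p sees x**(p+1) and the merged filter coeffs[p] * h."""
    y = [0.0] * (len(x) + len(h) - 1)
    for p, c in enumerate(coeffs):
        branch_input = [xi ** (p + 1) for xi in x]   # basis function of the input
        branch_filter = [c * hk for hk in h]          # coefficient absorbed into h
        for n, v in enumerate(convolve(branch_input, branch_filter)):
            y[n] += v
    return y


random.seed(0)
x = [random.uniform(-1.0, 1.0) for _ in range(64)]  # hypothetical input signal
coeffs = [1.0, 0.3, -0.2]                            # hypothetical expansion coefficients
h = [0.8, -0.4, 0.2, 0.1]                            # hypothetical impulse response

y_cascade = hammerstein_cascade(x, coeffs, h)
y_multi = hammerstein_multichannel(x, coeffs, h)
max_diff = max(abs(u - v) for u, v in zip(y_cascade, y_multi))
print("largest difference between the two forms:", max_diff)
```

The two outputs agree up to floating-point rounding because convolution is linear. The same argument applies to any other set of basis functions, such as the Fourier expansion the abstract proposes, by replacing `xi ** (p + 1)` with the corresponding basis function.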


International Conference on Acoustics, Speech, and Signal Processing | 2010

Online maximum-likelihood learning of time-varying dynamical models in block-frequency-domain

Sarmad Malik; Gerald Enzner

A linear dynamical model can be used to describe the evolution of an unknown system in noisy conditions. However, in most applications model parameters of a dynamical system are not known a priori, bringing into question the optimality of traditional state-only estimators. In this paper, we consider block-frequency-domain dynamical models and formulate an optimal framework for low-latency joint state and parameter estimation. We show that the resulting variational expectation-maximization algorithm in the block-frequency-domain offers a comprehensive and efficient solution for the joint estimation task.


IEEE Transactions on Audio, Speech, and Language Processing | 2014

Variational Bayesian inference for multichannel dereverberation and noise reduction

Dominic Schmid; Gerald Enzner; Sarmad Malik; Dorothea Kolossa; Rainer Martin

Room reverberation and background noise severely degrade the quality of hands-free speech communication systems. In this work, we address the problem of combined speech dereverberation and noise reduction using a variational Bayesian (VB) inference approach. Our method relies on a multichannel state-space model for the acoustic channels that combines frame-based observation equations in the frequency domain with a first-order Markov model to describe the time-varying nature of the room impulse responses. By modeling the channels and the source signal as latent random variables, we formulate a lower bound on the log-likelihood function of the model parameters given the observed microphone signals and iteratively maximize it using an online expectation-maximization approach. Our derivation yields update equations to jointly estimate the channel and source posterior distributions and the remaining model parameters. An inspection of the resulting VB algorithm for blind equalization and channel identification (VB-BENCH) reveals that the presented framework includes previously proposed methods as special cases. Finally, we evaluate the performance of our approach in terms of speech quality, adaptation times, and speech recognition results to demonstrate its effectiveness for a wide range of reverberation and noise conditions.


IEEE Signal Processing Letters | 2011

Recursive Bayesian Control of Multichannel Acoustic Echo Cancellation

Sarmad Malik; Gerald Enzner

We present a novel recursive Bayesian method in the DFT-domain to address the multichannel acoustic echo cancellation problem. We model the echo paths between the loudspeakers and the near-end microphone as a multichannel random variable with a first-order Markov property. The incorporation of the near-end observation noise, in conjunction with the multichannel Markov model, leads to a multichannel state-space model. We derive a recursive Bayesian solution to the multichannel state-space model, which turns out to be well suited for input signals that are not only auto-correlated but also cross-correlated. We show that the resulting multichannel state-space frequency-domain adaptive filter (MCSSFDAF) can be efficiently implemented due to the submatrix-diagonality of the state-error covariance. The filter offers optimal tracking and robust adaptation in the presence of near-end noise and echo path variability.
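
To make the state-space idea concrete, here is a deliberately reduced sketch: a single real-valued echo-path coefficient that follows a first-order Markov model and is observed through noisy measurements, tracked with a standard scalar Kalman recursion. It is a toy stand-in, not the multichannel DFT-domain filter (MCSSFDAF) from the paper. The function name `scalar_kalman_step` and the parameter values (`a`, `q`, `r`) are assumptions made for this example.

```python
import random


def scalar_kalman_step(w_hat, p, x, y, a, q, r):
    """One predict/update cycle for the model
         w[k] = a * w[k-1] + process noise (variance q)
         y[k] = x[k] * w[k] + observation noise (variance r)
    """
    # Prediction through the first-order Markov model
    w_pred = a * w_hat
    p_pred = a * a * p + q
    # Update with the new observation
    gain = p_pred * x / (x * x * p_pred + r)
    w_new = w_pred + gain * (y - x * w_pred)
    p_new = (1.0 - gain * x) * p_pred
    return w_new, p_new


random.seed(1)
a, q, r = 0.999, 1e-6, 0.01      # hypothetical transition factor and noise variances
w_true, w_hat, p = 0.5, 0.0, 1.0  # true state, initial estimate, initial variance

for _ in range(2000):
    w_true = a * w_true + random.gauss(0.0, q ** 0.5)  # slowly varying echo path
    x = random.uniform(-1.0, 1.0)                      # far-end (loudspeaker) sample
    y = x * w_true + random.gauss(0.0, r ** 0.5)       # microphone sample with noise
    w_hat, p = scalar_kalman_step(w_hat, p, x, y, a, q, r)

tracking_error = abs(w_hat - w_true)
print("tracking error:", tracking_error, "posterior variance:", p)
```

The Kalman gain acts as a step size that shrinks when the observation noise dominates and grows when the state uncertainty dominates, which is the mechanism behind the robust adaptation the abstract refers to.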


International Conference on Acoustics, Speech, and Signal Processing | 2012

An expectation-maximization algorithm for multichannel adaptive speech dereverberation in the frequency-domain

Dominic Schmid; Sarmad Malik; Gerald Enzner

This paper presents an online dereverberation algorithm that is derived within the maximum-likelihood expectation-maximization (ML-EM) framework. We formulate an overlap-save observation model for the multichannel blind problem in the DFT-domain. The modeling of acoustic channel impulse responses as random variables with a first-order Markov property enables the ensuing algorithm to cope with time-varying conditions. We then show that the ML-EM learning rules for the multichannel state-space model at hand take the form of a recursive posterior estimator for the channels, followed by an equalization stage for recovering the speech signal subject to an expectation with respect to the estimated channel posterior. Our derivation thus results in an iterative ML algorithm for blind equalization and channel identification (ML-BENCH), which comprises two distinct but coupled subsystems. The dereverberation performance of the proposed system is evaluated by considering spectrograms and instrumental quality measures.


International Conference on Acoustics, Speech, and Signal Processing | 2012

Variational Bayesian inference for nonlinear acoustic echo cancellation using adaptive cascade modeling

Sarmad Malik; Gerald Enzner

In this contribution, we present a variational Bayesian framework for the acoustic echo cancellation problem in the presence of a memoryless loudspeaker nonlinearity. We pursue a cascade modeling strategy, where first-order Markov models are described over the acoustic echo path and the nonlinear expansion coefficients. An iterative algorithm is then derived that learns the posterior on the echo path and the nonlinear coefficients to fit the evidence distribution. We show that the formulated variational Bayesian state-space frequency-domain adaptive filter is efficiently implementable and performs joint learning of the echo path and the loudspeaker nonlinearity. The algorithm exploits the internal exchange of the reliability information, resulting in effective linear and nonlinear echo cancellation.


IEEE Transactions on Signal Processing | 2013

A Variational Bayesian Learning Approach for Nonlinear Acoustic Echo Control

Sarmad Malik; Gerald Enzner

In this work, we present novel Bayesian algorithms for acoustic echo cancellation and residual echo suppression in the presence of a memoryless loudspeaker nonlinearity. The system nonlinearity is modeled using a basis-generic nonlinear expansion. This allows us to express the microphone observation in the DFT domain in terms of the nonlinear-expansion coefficients and the acoustic echo path. We augment the observation model with first-order Markov models for the echo-path vector and the nonlinear-expansion coefficients to arrive at a composite state-space model. The echo path vector and each nonlinear-expansion coefficient are designated as the unknown random variables in our Bayesian model. The posterior estimators for the random variables and the learning rules for the a priori unknown model parameters are then derived via the maximization of the variational lower bound on the log likelihood. We further show that a Bayesian post-filter for residual echo suppression can be derived by optimizing a minimum-mean-square error (MMSE) cost function subject to marginalization with respect to the posteriors estimated in the echo cancellation stage. The effectiveness of the approach is supported by simulation results and an analysis using instrumental performance measures.
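
For readers unfamiliar with the term, the variational lower bound mentioned in this abstract has the following generic textbook form. The notation is an assumption made here and is not taken from the paper: Y stands for the observations, Z for the latent random variables (in this setting, the echo path and the nonlinear-expansion coefficients), θ for the model parameters, and q for the approximating posterior.

```latex
\ln p(Y \mid \theta)
  \;=\; \underbrace{\mathbb{E}_{q(Z)}\!\left[\ln p(Y, Z \mid \theta)\right]
        - \mathbb{E}_{q(Z)}\!\left[\ln q(Z)\right]}_{\mathcal{L}(q,\,\theta)}
  \;+\; \mathrm{KL}\!\left(q(Z) \,\middle\|\, p(Z \mid Y, \theta)\right)
  \;\geq\; \mathcal{L}(q, \theta)
```

Because the Kullback-Leibler term is non-negative, maximizing the bound with respect to q yields posterior estimators, and maximizing it with respect to θ yields learning rules for the model parameters.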


IEEE Signal Processing Letters | 2012

A State-Space Cross-Relation Approach to Adaptive Blind SIMO System Identification

Sarmad Malik; Dominic Schmid; Gerald Enzner

In this work, we address blind single-input multiple-output (SIMO) system identification in conjunction with dynamical modeling of the underlying system. A multichannel cross-relation observation model in the DFT domain is employed to derive a blind adaptive algorithm that recursively learns the posterior distribution on the unknown SIMO system. The proposed algorithm inherently incorporates the time-varying nature of the channels and a representation of the observation noise. We show that the resulting cross-relation state-space frequency-domain adaptive filter (CR-SSFDAF), owing to its stable and diagonalized structure and near-optimal step-size control, can be efficiently operated in time-varying and noisy conditions.
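
The cross-relation that underlies this kind of blind identification can be checked numerically. In a noise-free two-channel SIMO system, each output is the common source convolved with its own channel, so filtering output 1 with channel 2 gives the same signal as filtering output 2 with channel 1. The sketch below only verifies this identity; it is not the CR-SSFDAF algorithm, and the source and channel values are hypothetical.

```python
import random


def convolve(a, b):
    """Full linear convolution of two finite sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


random.seed(2)
s = [random.gauss(0.0, 1.0) for _ in range(50)]  # unknown source (hypothetical)
h1 = [1.0, 0.5, -0.2]                            # channel to sensor 1 (hypothetical)
h2 = [0.7, -0.3, 0.1]                            # channel to sensor 2 (hypothetical)

x1 = convolve(s, h1)  # noise-free observation at sensor 1
x2 = convolve(s, h2)  # noise-free observation at sensor 2

# Cross-relation: x1 * h2 == x2 * h1, since both equal s * h1 * h2
lhs = convolve(x1, h2)
rhs = convolve(x2, h1)
cross_relation_error = max(abs(u - v) for u, v in zip(lhs, rhs))
print("largest cross-relation mismatch:", cross_relation_error)
```

A blind method can use this identity because it relates the channels to the observed signals alone, without access to the source. With observation noise the identity holds only approximately, which is why the abstract stresses a representation of the observation noise.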


Proceedings of IWAENC 2012; International Workshop on Acoustic Signal Enhancement | 2012

A Maximum A Posteriori Approach to Multichannel Speech Dereverberation and Denoising

Dominic Schmid; Sarmad Malik; Gerald Enzner
