Mahdi Triki
Philips
Publication
Featured research published by Mahdi Triki.
international conference on acoustics, speech, and signal processing | 2009
Mahdi Triki; Dirk T. M. Slock
A multiple fundamental frequency estimator is a key building block in music transcription and indexing operations. However, systems that attempt this task tend to be very complex [1]. Indeed, music transcription requires an analysis that accounts for both physical and psycho-acoustical aspects. In this work, we propose a physically motivated audio signal analysis followed by an auditory-based selection. The audio signal model allows for a better time/frequency resolution tradeoff, while the auditory distance discards redundant or non-relevant information. No prior information on the musical instrument, musical genre, or maximum polyphony is needed. Simulations show that the proposed technique achieves good transcription results for a variety of string and wind instruments. The proposed scheme is also shown to be robust in the presence of noise and percussive sounds, and in unbalanced Signal-to-Interference Ratio (SIR) situations.
multimedia signal processing | 2008
Mahdi Triki; Dirk T. M. Slock; Ahmed Triki
A key building block in music transcription and indexing operations is the decomposition of music signals into notes. We model a note signal as a periodic signal with (slow) frequency-selective amplitude modulation and global time warping. Time-varying frequency-selective amplitude modulation allows the various harmonics of the periodic signal to decay at different speeds. Time-warping allows for some limited global frequency modulation. The bandlimited variation of the frequency-selective amplitude modulation and of the global time warping gets expressed through a subsampled representation and parametrization of the corresponding signals. Assuming additive white Gaussian noise, a maximum likelihood approach is proposed for the estimation of the model parameters and the optimization is performed in an iterative (cyclic) fashion that leads to a sequence of simple least-squares problems.
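The signal model in this abstract can be illustrated with a small synthesis sketch. This is not the authors' estimator: it only generates a signal of the described kind, a harmonic (periodic) waveform whose harmonics decay at different rates, with an optional global time warp. The function name `synth_note`, the exponential decay shape, and all parameter values are assumptions made for illustration.

```python
import math


def synth_note(f0, amplitudes, decays, duration, sample_rate=8000,
               warp=lambda t: t):
    """Synthesize a note as a sum of harmonics of f0.

    Harmonic k has initial amplitude amplitudes[k-1] and decays as
    exp(-decays[k-1] * t), so each harmonic can fade at its own speed
    (a simple stand-in for frequency-selective amplitude modulation).
    `warp` maps time to warped time and is shared by all harmonics
    (a stand-in for global time warping); identity means no warping.
    """
    num_samples = int(round(duration * sample_rate))
    samples = []
    for i in range(num_samples):
        t = i / sample_rate
        warped_t = warp(t)
        value = 0.0
        for k, (amp, decay) in enumerate(zip(amplitudes, decays), start=1):
            envelope = amp * math.exp(-decay * t)
            value += envelope * math.cos(2.0 * math.pi * k * f0 * warped_t)
        samples.append(value)
    return samples


# Two harmonics; the second decays three times faster than the first.
note = synth_note(220.0, [1.0, 0.5], [2.0, 6.0], duration=0.01)
```

Passing a slowly varying `warp`, for example `lambda t: t + 1e-4 * math.sin(2 * math.pi * 5 * t)`, adds a mild global frequency modulation that affects all harmonics consistently.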
international conference on acoustics, speech, and signal processing | 2009
Mahdi Triki; Kees Janse
Speech enhancement is the processing of speech signals to improve one or more perceptual aspects. If the statistics of the clean signal and the noise process are explicitly known, enhancement can be ‘optimally’ accomplished (minimizing a distortion measure between the clean and the estimated signals). In practice, however, these statistics are not explicitly available, and the overall enhancement accuracy depends critically on the estimation quality of the unknown statistics. Estimating the noise (and speech) statistics is a particularly critical and challenging problem under non-stationary noise conditions. In this paper, we investigate noise floor estimation using subspace decomposition. We examine the assumption that the speech DFT is rank-limited. We propose a new noise PSD estimation scheme, called Minimum Subspace Noise Tracking (MSNT). The proposed scheme can be interpreted as a combination of the subspace structure and minimum statistics tracking. An experimental investigation of the MSNT tracking performance and a comparison with the state of the art are also presented.
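As a rough illustration of the minimum statistics ingredient mentioned above, the sketch below tracks a noise floor as the minimum of a recursively smoothed power sequence over a sliding window. It is not the MSNT scheme: the subspace decomposition is omitted entirely, and the function name `track_noise_floor`, the smoothing constant, and the window length are assumptions chosen for illustration.

```python
from collections import deque


def track_noise_floor(power, alpha=0.8, window=5):
    """Return a per-frame noise floor estimate for one frequency bin.

    `power` is a sequence of per-frame power values (e.g. |DFT|^2).
    Each value is first smoothed recursively with constant `alpha`;
    the floor is the minimum of the smoothed values over the most
    recent `window` frames.
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = None
    recent = deque(maxlen=window)  # keeps only the last `window` values
    floor = []
    for p in power:
        if smoothed is None:
            smoothed = p
        else:
            smoothed = alpha * smoothed + (1.0 - alpha) * p
        recent.append(smoothed)
        floor.append(min(recent))
    return floor
```

The window length sets the tradeoff: a longer window is less likely to mistake speech energy for noise, but reacts more slowly when the noise level rises.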
international conference on acoustics, speech, and signal processing | 2009
Mahdi Triki
One fundamental non-stationary scenario involves a time-varying system in which the cross-correlation between the input signal and the desired response is time-varying. This case occurs in speech enhancement applications, where the optimal solution is time-varying due to the non-stationarity of the speech signal. Performance analysis of adaptive filtering for time-varying systems is crucial to better understand the tracking behavior and to ‘optimally’ design the update schemes. In this work, we investigate the tracking performance of the adaptive generalized sidelobe canceller (GSC) applied to speech denoising. First, we interpret the noise cancellation in terms of non-stationary system identification. Then, we formulate the recursive least-squares (RLS) adaptation as a filtering operation on the (time-varying) optimal filter and the instantaneous gradient noise (induced by the measurement noise). Under some structural assumptions, we derive an expression for the Excess Mean Squared Error (EMSE). Monte-Carlo simulations show that the proposed expression allows for a good prediction of the EMSE, and outperforms the state-of-the-art approximations.
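For readers unfamiliar with the adaptation rule being analyzed, here is a minimal sketch of the textbook exponentially weighted RLS update, used for plain FIR system identification. It does not reproduce the GSC structure or the EMSE analysis of the paper; the function name `rls_identify` and the default forgetting factor and initialization gain are assumptions.

```python
def rls_identify(inputs, desired, num_taps=2, forgetting=0.99, init_gain=1e3):
    """Estimate FIR filter taps with exponentially weighted RLS.

    inputs, desired: equal-length sequences of input and desired samples.
    Returns the tap estimates after processing all samples.
    """
    weights = [0.0] * num_taps
    # P approximates the inverse of the weighted input correlation matrix.
    P = [[init_gain if i == j else 0.0 for j in range(num_taps)]
         for i in range(num_taps)]
    x = [0.0] * num_taps  # regressor, most recent input sample first
    for u, d in zip(inputs, desired):
        x = [u] + x[:-1]
        Px = [sum(P[i][j] * x[j] for j in range(num_taps))
              for i in range(num_taps)]
        denom = forgetting + sum(x[i] * Px[i] for i in range(num_taps))
        gain = [v / denom for v in Px]
        # A priori error: computed with the weights before this update.
        err = d - sum(weights[i] * x[i] for i in range(num_taps))
        weights = [weights[i] + gain[i] * err for i in range(num_taps)]
        # P is symmetric, so x^T P equals (P x)^T.
        P = [[(P[i][j] - gain[i] * Px[j]) / forgetting
              for j in range(num_taps)] for i in range(num_taps)]
    return weights
```

A forgetting factor below 1 discounts old samples, which is what lets the filter follow a time-varying optimum at the cost of a noisier estimate.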
2008 Hands-Free Speech Communication and Microphone Arrays | 2008
Mahdi Triki; Dirk T. M. Slock
We consider the blind multichannel dereverberation problem for a single source. We have shown before [5] that the single-input multi-output (SIMO) reverberation filter can be equalized blindly by applying multivariate linear prediction (LP) to its output (after SISO input pre-whitening). In this paper, we investigate LP-based dereverberation in a noisy environment and/or under acoustic channel length underestimation. Treating ambient noise and late reverberation as additive noise, we propose to introduce a postfilter that transforms the multivariate prediction filter into a somewhat longer equalizer. The postfilter allows equalization to a non-zero delay. Both MMSE-ZF and MMSE design criteria are considered for the postfilter. Simulations show that the proposed scheme is robust to noise and to channel length underestimation, and that it performs better than the classic delay-&-predict equalizer and the delay-&-sum beamformer.
2011 Joint Workshop on Hands-free Speech Communication and Microphone Arrays | 2011
Mahdi Triki
Various speech enhancement techniques (e.g. noise suppression, dereverberation) rely on knowledge of the statistics of the clean signal and the noise process. In practice, however, these statistics are not explicitly available, and the overall enhancement accuracy depends critically on the estimation quality of the unknown statistics. In this respect, subspace-based approaches have been shown to reduce estimation delay and to achieve a good tradeoff between tracking and final misadjustment [3, 5]. To track noise non-stationarity accurately, subspace schemes face the challenge of estimating the correlation of the observed signal from a limited number of samples. In this paper, we propose and investigate a median-search approach to update the noise floor estimate and alleviate estimation noise artifacts. An experimental investigation of the tracking bias and performance, and a comparison with some state-of-the-art techniques, are also presented.
international conference on acoustics, speech, and signal processing | 2010
Mahdi Triki; Dirk T. M. Slock
Refined estimation and tracking of instantaneous frequency variations is desirable for a variety of audio applications (audio coding, singer segregation, music transcription and transformations, etc.). In the present paper, we extend the approach of periodic modeling with global amplitude and frequency modulation [16]. We introduce a first-order approximation that produces an additive term involving the derivative of the ‘normalized waveform’ multiplied by the instantaneous FM signal. The variations of the global FM are expressed through a subsampled representation and estimated using a simple least-squares scheme.
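The first-order approximation can be written out as a short worked step. The notation is assumed here for illustration and is not taken from the paper: let $\theta(t)$ denote the periodic ‘normalized waveform’ and $\varphi(t)$ a small, slowly varying time-warping deviation standing in for the instantaneous FM signal. A first-order Taylor expansion around $t$ gives

$$
\theta\bigl(t + \varphi(t)\bigr) \;\approx\; \theta(t) + \varphi(t)\,\theta'(t).
$$

Under this approximation the modulated signal is linear in $\varphi(t)$, which is consistent with the abstract's statement that the FM variations can be estimated with a simple least-squares scheme once $\varphi$ is described by a small number of subsampled values.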
Archive | 2010
Mahdi Triki; Cornelis Pieter Janse
Archive | 2012
Mahdi Triki
conference of the international speech communication association | 2010
Mahdi Triki; Kees Janse