Amin Fazel
Michigan State University
Publication
Featured research published by Amin Fazel.
IEEE Circuits and Systems Magazine | 2011
Amin Fazel; Shantanu Chakrabartty
Even though speaker verification has been investigated for several decades, numerous challenges and new opportunities in robust recognition techniques are still being explored. In this overview paper, we first provide a brief introduction to the statistical pattern recognition techniques that are commonly used for speaker verification. The second part of the paper presents traditional and modern techniques that make real-world speaker verification systems robust to degradation caused by ambient noise, channel variations, aging effects, and the limited availability of training samples. The paper concludes with a discussion of future trends and research opportunities in this area.
IEEE Transactions on Circuits and Systems | 2010
Amit Gore; Amin Fazel; Shantanu Chakrabartty
Localization of acoustic sources using miniature microphone arrays poses a significant challenge due to fundamental limitations imposed by the physics of sound propagation. With sub-wavelength distances between the microphones, resolving acute localization cues becomes difficult due to precision artifacts. In this paper, we propose a framework that overcomes this limitation by integrating signal measurement (analog-to-digital conversion) with statistical learning (bearing estimation). At the core of the proposed approach is a min-max stochastic optimization of a regularized cost function that embeds manifold learning within ΣΔ modulation. As a result, the algorithm directly produces a quantized sequence of bearing estimates whose precision can be improved asymptotically, similar to conventional ΣΔ modulators. We present a hardware implementation of a miniature acoustic source localizer that comprises (a) a common-mode canceling microphone array and (b) a ΣΔ integrated circuit that produces bearing parameters. The parameters are then combined in an estimation procedure that can achieve a linear range from 0° to 90°. Measured results from a prototype fabricated in a 0.5 μm CMOS process demonstrate that the proposed localizer can reliably estimate the bearing of an acoustic source with a resolution of less than 2° while consuming less than 75 μW of power.
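The asymptotic precision property mentioned in the abstract above can be illustrated with a generic, textbook first-order ΣΔ (sigma-delta) modulator. This is only a sketch of conventional modulation, not the paper's min-max learning scheme or its circuit; the function names `sigma_delta_first_order` and `decode_mean` are hypothetical.

```python
import numpy as np

def sigma_delta_first_order(x):
    """Generic first-order sigma-delta modulator (illustrative only).

    Takes samples in [-1, 1] and returns a +/-1 bitstream whose running
    average tracks the input.
    """
    integrator = 0.0
    bits = np.empty(len(x))
    for n, sample in enumerate(x):
        # 1-bit quantizer acting on the integrator state
        bit = 1.0 if integrator >= 0.0 else -1.0
        # Accumulate the error between the input and the fed-back bit
        integrator += sample - bit
        bits[n] = bit
    return bits

def decode_mean(bits):
    """Simplest possible decoder: average the bitstream."""
    return float(np.mean(bits))

# For a constant input the integrator stays bounded, so the decoding
# error shrinks roughly in proportion to 1/N as more bits are averaged.
for n_bits in (100, 1000, 10000):
    bits = sigma_delta_first_order(np.full(n_bits, 0.3))
    print(n_bits, abs(decode_mean(bits) - 0.3))
```

Because the integrator state is bounded, the averaged bitstream converges to the input as the number of bits grows, which is the sense in which a 1-bit sequence can reach arbitrarily fine precision.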
IEEE Transactions on Audio, Speech, and Language Processing | 2012
Amin Fazel; Shantanu Chakrabartty
In this paper, we present a novel speech feature extraction algorithm based on a hierarchical combination of auditory similarity and pooling functions. The computationally efficient features, known as “Sparse Auditory Reproducing Kernel” (SPARK) coefficients, are extracted under the hypothesis that the noise-robust information in the speech signal is embedded in a reproducing kernel Hilbert space (RKHS) spanned by overcomplete, nonlinear, and time-shifted gammatone basis functions. The feature extraction algorithm first computes a kernel-based similarity between the speech signal and the time-shifted gammatone functions, followed by feature pruning using a simple pooling technique (“MAX” operation). We describe the effect of different hyper-parameters and kernel functions on the performance of a SPARK-based speech recognizer. Experimental results on the standard AURORA2 dataset demonstrate that the SPARK-based speech recognizer delivers consistent improvements in word accuracy when compared with a baseline speech recognizer trained using the standard ETSI STQ WI008 DSR features.
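The two stages named in the abstract above (kernel similarity against time-shifted gammatone functions, then MAX pooling) can be sketched as follows. This is a rough illustration under stated assumptions, not the published SPARK algorithm: the polynomial kernel, the gammatone order and bandwidth, the number of shifts, and the function names `gammatone` and `spark_like_features` are all choices made here for the example.

```python
import numpy as np

def gammatone(fc, fs, length, order=4, bandwidth=100.0):
    """Unit-norm gammatone function at center frequency fc (Hz).

    order and bandwidth are assumed values chosen for illustration.
    """
    t = np.arange(length) / fs
    g = t ** (order - 1) * np.exp(-2 * np.pi * bandwidth * t) * np.cos(2 * np.pi * fc * t)
    return g / np.linalg.norm(g)

def spark_like_features(frame, fs, centers, n_shifts=8, length=200, degree=2):
    """One pooled feature per center frequency (illustrative sketch)."""
    hop = (len(frame) - length) // max(n_shifts - 1, 1)
    feats = np.empty(len(centers))
    for i, fc in enumerate(centers):
        g = gammatone(fc, fs, length)
        sims = []
        for k in range(n_shifts):
            seg = frame[k * hop : k * hop + length]
            # Polynomial kernel between the frame and the k-th shifted basis
            # (an assumed kernel; the abstract says several were studied)
            sims.append((1.0 + seg @ g) ** degree)
        # "MAX" pooling across the time shifts
        feats[i] = np.max(sims)
    return feats

# Usage: a frame that contains a 1 kHz gammatone burst responds most
# strongly in the 1 kHz channel.
fs = 8000
frame = np.zeros(480)
frame[80:280] = gammatone(1000.0, fs, 200)
print(spark_like_features(frame, fs, centers=[500.0, 1000.0, 2000.0]))
```

Pooling over shifts keeps only the best-aligned response per channel, which makes the feature insensitive to where within the frame the matching structure occurs.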
IEEE Transactions on Signal Processing | 2010
Amin Fazel; Amit Gore; Shantanu Chakrabartty
Many source separation algorithms fail to deliver robust performance when applied to signals recorded using high-density sensor arrays where the distance between sensor elements is much less than the wavelength of the signals. This can be attributed to the limited dynamic range of the sensor (determined by analog-to-digital conversion), which is insufficient to overcome the artifacts due to large cross-channel redundancy, nonhomogeneous mixing, and high dimensionality of the signal space. This paper proposes a novel framework that overcomes these limitations by integrating statistical learning directly with the signal measurement (analog-to-digital) process, which enables high-fidelity separation of linear instantaneous mixtures. At the core of the proposed approach is a min-max optimization of a regularized objective function that yields a sequence of quantized parameters that asymptotically tracks the statistics of the input signal. Experiments with synthetic and real recordings demonstrate significant and consistent performance improvements when the proposed approach is used as the analog-to-digital front-end to conventional source separation algorithms.
International Symposium on Circuits and Systems | 2009
Amin Fazel; Shantanu Chakrabartty
In this paper, we present a non-linear filtering approach for extracting noise-robust speech features that can be used in a speaker verification task. At the core of the proposed approach is a time-series regression using Reproducing Kernel Hilbert Space (RKHS) based methods that extracts discriminatory non-linear signatures while filtering out the non-informative noise components. A linear projection is then used to map the characteristics of the RKHS regression function into a linear-predictive vector, which is then presented as an input to a back-end speaker verification engine. Experiments using the YOHO speaker verification corpus show that a recognition system trained using the proposed features delivers consistent improvements over an equivalent verification system based on Mel-frequency cepstral coefficients (MFCCs) for signal-to-noise ratios ranging from 0 to 30 dB.
International Symposium on Circuits and Systems | 2011
Amin Fazel; Shantanu Chakrabartty
In this paper, we present a novel speech feature extraction algorithm based on sparse auditory coding and regression techniques in a reproducing kernel Hilbert space (RKHS). The features, known as sparse kernel cepstral coefficients (SKCC), are extracted under the hypothesis that the noise-robust information in the speech signal is embedded in a subspace spanned by overcomplete, regularized, and normalized gammatone basis functions. After identifying the information-bearing subspace, noise robustness is achieved by sparsifying the SKCC features using simple thresholding. We show that computing the SKCC features involves correlating the speech signal with a pre-computed matrix, thus making the algorithm amenable to DSP-based implementation. Speech recognition experiments using the AURORA 2 dataset demonstrate that the SKCC features deliver consistent improvements in recognition performance over state-of-the-art features under different noisy recording conditions.
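The computational pattern described in the abstract above, a product with a pre-computed matrix followed by simple thresholding, can be sketched in a few lines. This is a minimal illustration rather than the SKCC algorithm itself: the hard-thresholding rule, the placeholder matrix, and the function name `sparse_matrix_features` are assumptions made for the example.

```python
import numpy as np

def sparse_matrix_features(frame, basis_matrix, threshold):
    """Project a frame onto pre-computed rows, then zero small coefficients.

    Hard thresholding is an assumption here; the abstract only says
    "simple thresholding".
    """
    coeffs = basis_matrix @ frame
    return np.where(np.abs(coeffs) >= threshold, coeffs, 0.0)

# Usage with a placeholder matrix. In practice the matrix would be
# derived offline from the chosen basis functions.
basis_matrix = np.eye(3)
frame = np.array([0.05, -0.5, 0.2])
print(sparse_matrix_features(frame, basis_matrix, threshold=0.1))
```

Since the matrix is fixed ahead of time, the per-frame cost is a single matrix-vector product plus an element-wise comparison, which is the kind of workload DSP hardware handles well.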
International Symposium on Circuits and Systems | 2010
Amin Fazel; Shantanu Chakrabartty
The performance of acoustic source separation algorithms degrades significantly when they are applied to signals recorded using miniature microphone arrays where the distances between the microphone elements are much smaller than the wavelength of the acoustic signals. This can be attributed to the limited dynamic range of the sensor (determined by analog-to-digital conversion), which is insufficient to overcome the artifacts due to large cross-channel redundancy, non-homogeneous mixing, and high dimensionality of the signal space. This paper presents some of the recent progress in the area of sigma-delta learning, which integrates statistical learning with the analog-to-digital conversion process and enables super-resolution auditory localization and separation. Experiments with synthetic and real recordings demonstrate significant and consistent performance improvements when the proposed approach is used as the analog-to-digital front-end to conventional source separation algorithms.
International Symposium on Circuits and Systems | 2008
Amin Fazel; Shantanu Chakrabartty
Archive | 2012
Shantanu Chakrabartty; Amin Fazel
Archive | 2011
Amin Fazel; Shantanu Chakrabartty