Muhammad Ghulam
Toyohashi University of Technology
Publication
Featured research published by Muhammad Ghulam.
International Conference on Acoustics, Speech, and Signal Processing | 2005
Muhammad Ghulam; Takashi Fukuda; Junsei Horikawa; Tsuneo Nitta
A pitch-synchronous (PS) auditory feature extraction method based on ZCPA (zero-crossings peak-amplitudes) has been proposed (Ghulam, M. et al., Proc. ICSLP04, 2004) and was shown to be more robust than the conventional ZCPA (Kim, D.S. et al., IEEE Trans. Speech Audio Process., vol.7, no.1, p.55-69, 1999). We examine the effect of auditory masking, both simultaneous and temporal, in the PS-ZCPA method. We also vary the number of histogram bins to find the optimum parameters of the proposed method. Experimental results demonstrate that embedding auditory masking improves the performance of the PS-ZCPA method; for example, with both masking methods embedded, the performance increases from the 69.92% obtained without masking to 73.71%, whereas increasing the number of histogram bins yields little improvement.
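As a rough illustration of the zero-crossing and peak-amplitude idea behind ZCPA, the sketch below builds a frequency histogram for a single sub-band signal. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `zcpa_histogram`, the bin edges, and the `log1p` compression are illustrative choices, and the pitch-synchronous framing and auditory masking discussed in the abstract are omitted.

```python
import numpy as np

def zcpa_histogram(band_signal, sample_rate, bin_edges_hz):
    """Accumulate a ZCPA-style frequency histogram for one sub-band signal.

    Hypothetical helper: each pair of successive upward zero-crossings gives
    a frequency estimate (inverse of the interval), and the peak amplitude
    inside that interval, after compression, is added to the matching bin.
    """
    x = np.asarray(band_signal, dtype=float)
    # Index of the first non-negative sample after each upward crossing.
    up = np.flatnonzero((x[:-1] < 0.0) & (x[1:] >= 0.0)) + 1
    hist = np.zeros(len(bin_edges_hz) - 1)
    for start, end in zip(up[:-1], up[1:]):
        freq = sample_rate / (end - start)   # inverse of the crossing interval
        peak = x[start:end].max()            # peak amplitude within the interval
        b = np.searchsorted(bin_edges_hz, freq, side="right") - 1
        if 0 <= b < len(hist):
            hist[b] += np.log1p(peak)        # compression form is an assumption
    return hist

# Usage: a 100 Hz tone sampled at 8 kHz should land in the 50-150 Hz bin.
t = np.arange(8000) / 8000.0
tone = np.sin(2 * np.pi * 100.0 * t + 0.3)
print(zcpa_histogram(tone, 8000, [0, 50, 150, 300, 1000]))
```

In a full front-end this histogram would be computed per filter-bank channel and summed across channels; changing `bin_edges_hz` corresponds to the histogram-bin parameter varied in the experiments.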
IEICE Transactions on Information and Systems | 2006
Muhammad Ghulam; Takashi Fukuda; Kouichi Katsurada; Junsei Horikawa; Tsuneo Nitta
A pitch-synchronous (PS) auditory feature extraction method based on ZCPA (Zero-Crossings Peak-Amplitudes) was proposed previously and was shown to be more robust than conventional ZCPA and MFCC-based features. In this paper, first, a non-linear adaptive threshold adjustment procedure is introduced into the PS-ZCPA method to obtain optimal results in noisy conditions with different signal-to-noise ratios (SNRs). Next, auditory masking, a well-known auditory perception phenomenon, and modulation enhancement, which reflects the strong relationship between modulation spectra and speech intelligibility, are embedded into the PS-ZCPA method. Finally, a Wiener filter-based noise reduction procedure is integrated into the method to make it more noise-robust, and the performance is evaluated against ETSI ES202 (WI008), a standard front-end for distributed speech recognition. All experiments were carried out on the Aurora-2J database. The experimental results demonstrated improved performance of the PS-ZCPA method with auditory masking embedded, and slightly improved performance with modulation enhancement. The PS-ZCPA method with Wiener filter-based noise reduction also outperformed ETSI ES202 (WI008).
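For readers unfamiliar with Wiener filter-based noise reduction, the following is a minimal sketch of the textbook per-frequency Wiener gain, assuming a known noise power estimate. It is not the procedure used in the paper or in the ETSI front-end; the function name `wiener_gain`, the power-subtraction estimate of speech power, and the gain floor are assumptions made for illustration.

```python
import numpy as np

def wiener_gain(noisy_power, noise_power, floor=1e-3):
    """Textbook Wiener gain per frequency bin (hypothetical helper).

    gain = S / (S + N), where the clean-speech power S is estimated by
    subtracting the noise power N from the noisy power and clipping at zero.
    """
    noisy_power = np.asarray(noisy_power, dtype=float)
    noise_power = np.asarray(noise_power, dtype=float)
    speech_power = np.maximum(noisy_power - noise_power, 0.0)
    gain = speech_power / np.maximum(speech_power + noise_power, 1e-12)
    # A small floor avoids zeroing bins entirely (assumed design choice).
    return np.maximum(gain, floor)

# Usage: a bin at 4x the noise power keeps 75% of its amplitude;
# a bin at the noise level is attenuated down to the floor.
print(wiener_gain([4.0, 1.0], [1.0, 1.0]))
```

The gain would be multiplied with the noisy spectrum before feature extraction; bins dominated by noise are suppressed while high-SNR bins pass nearly unchanged.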
IEICE Transactions on Information and Systems | 2008
Mohammad Nurul Huda; Muhammad Ghulam; Takashi Fukuda; Kouichi Katsurada; Tsuneo Nitta
This paper describes a robust automatic speech recognition (ASR) system with reduced computation. Acoustic models of a hidden Markov model (HMM)-based classifier include various types of hidden factors, such as speaker-specific characteristics, coarticulation, and the acoustic environment. If a canonicalization process can recover the margin of acoustic likelihoods between correct phonemes and other ones that is degraded by hidden factors, the robustness of ASR systems can be improved. In this paper, we introduce a canonicalization method composed of multiple distinctive phonetic feature (DPF) extractors, one for each hidden-factor canonicalization, and a DPF selector that selects an optimum DPF vector as input to the HMM-based classifier. The proposed method resolves gender factors and speaker variability, and eliminates noise factors by applying canonicalization based on the DPF extractors and two-stage Wiener filtering. In experiments on AURORA-2J, the proposed method provides higher word accuracy under clean training and a significant improvement in word accuracy at low signal-to-noise ratios (SNRs) under multi-condition training, compared to a standard ASR system with mel-frequency cepstral coefficient (MFCC) parameters. Moreover, the proposed method requires only two-fifths of the Gaussian mixture components and less memory to achieve accurate ASR.
International Conference on Acoustics, Speech, and Signal Processing | 2006
Muhammad Ghulam; Junsei Horikawa; Tsuneo Nitta
In this paper, we propose a novel pitch-synchronous auditory-based feature extraction method for robust automatic speech recognition (ASR). A pitch-synchronous zero-crossing peak-amplitude (PS-ZCPA)-based feature extraction method was proposed previously and showed improved performance, except when modulation enhancement was integrated together with Wiener filter (WF)-based noise reduction and auditory masking. However, since zero-crossing is not an auditory event, we propose a new pitch-synchronous peak-amplitude (PS-PA)-based method to make the ASR feature extractor more auditory-like. We also examine the effects of WF-based noise reduction, modulation enhancement, and auditory masking in the proposed PS-PA method using the Aurora-2J database. The experimental results showed the superiority of the proposed method over the PS-ZCPA method, and the problem caused by reconstructing zero-crossings from the modulated envelope was eliminated. The highest relative performance over MFCC, 67.33%, was achieved using the PS-PA method together with WF-based noise reduction, modulation enhancement, and auditory masking.
International Conference on Acoustics, Speech, and Signal Processing | 2002
Takaharu Sato; Muhammad Ghulam; Takashi Fukuda; Tsuneo Nitta
In this paper, we propose a novel confidence scoring method that is applied to N-best hypotheses output from an HMM-based classifier. In the first pass of the proposed method, the HMM-based classifier with monophone models outputs N-best hypotheses and the boundaries of all the monophones in the hypotheses. In the second pass, an SM (subspace method)-based verifier tests the hypotheses by comparing confidence scores. We discuss how to convert a monophone similarity score of SM into a likelihood score, how to normalize the variations of acoustic quality in an utterance, and how to combine a word-level HMM-based likelihood with a monophone-level SM-based likelihood. In experiments on speaker-independent word recognition, the proposed confidence scoring method significantly improves the correct word recognition rate from the 95.3% obtained by the standard HMM classifier to 98.0%.
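The two-pass idea can be sketched as N-best rescoring with a weighted sum of two log-likelihoods. This is only an illustration: the abstract does not state the combination rule, so the linear interpolation, the `weight` parameter, and the function name `rescore_nbest` are assumptions, and the score conversion and normalization steps are left out.

```python
def rescore_nbest(hypotheses, weight=0.5):
    """Return the word whose combined score is highest (hypothetical helper).

    hypotheses: list of (word, hmm_log_likelihood, verifier_log_likelihood).
    weight: share given to the verifier score; 0.0 keeps the first-pass ranking.
    """
    def combined(hypothesis):
        _, hmm_ll, verifier_ll = hypothesis
        # Linear interpolation of log-likelihoods (assumed combination rule).
        return (1.0 - weight) * hmm_ll + weight * verifier_ll

    return max(hypotheses, key=combined)[0]

# Usage: the first pass slightly prefers "a", but the verifier strongly
# prefers "b", so the combined score selects "b".
nbest = [("a", -10.0, -8.0), ("b", -10.5, -3.0)]
print(rescore_nbest(nbest, weight=0.5))
```

With `weight=0.0` the function reproduces the first-pass HMM decision, which makes it easy to compare recognition rates with and without the second-pass verifier.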
IEICE Transactions on Information and Systems | 2006
Muhammad Ghulam; Kouichi Katsurada; Junsei Horikawa; Tsuneo Nitta
A novel pitch-synchronous auditory-based feature extraction method for robust automatic speech recognition (ASR) is proposed. A pitch-synchronous zero-crossing peak-amplitude (PS-ZCPA)-based feature extraction method was proposed previously and showed improved performance, except when modulation enhancement was integrated with Wiener filter (WF)-based noise reduction and auditory masking. However, since zero-crossing is not an auditory event, we propose a new pitch-synchronous peak-amplitude (PS-PA)-based method to render the feature extractor of ASR more auditory-like. We also examine the effects of WF-based noise reduction, modulation enhancement, and auditory masking in the proposed PS-PA method using the Aurora-2J database. The experimental results show the superiority of the proposed method over the PS-ZCPA and other conventional methods. Furthermore, the problem due to the reconstruction of zero-crossings from a modulated envelope is eliminated. The experimental results also show the superiority of PS over PA in terms of the robustness of ASR, though PS and PA lead to significant improvement when applied together.
Conference of the International Speech Communication Association | 2004
Muhammad Ghulam; Takashi Fukuda; Junsei Horikawa; Tsuneo Nitta
Conference of the International Speech Communication Association | 2007
Mohammad Nurul Huda; Muhammad Ghulam; Junsei Horikawa; Tsuneo Nitta
Archive | 2007
Huda Mohammad Nurul; Muhammad Ghulam; Kouichi Katsurada; Yurie Iribe; Tsuneo Nitta
Conference of the International Speech Communication Association | 2005
Takashi Fukuda; Muhammad Ghulam; Tsuneo Nitta