Jibran Yousafzai
King's College London
Publication
Featured research published by Jibran Yousafzai.
IEEE Transactions on Audio, Speech, and Language Processing | 2011
Jibran Yousafzai; Peter Sollich; Zoran Cvetkovic; Bin Yu
This paper proposes methods for combining cepstral and acoustic waveform representations for a front-end of support vector machine (SVM)-based speech recognition systems that are robust to additive noise. The key issue of kernel design and noise adaptation for the acoustic waveform representation is addressed first. Cepstral and acoustic waveform representations are then compared on a phoneme classification task. Experiments show that the cepstral features achieve very good performance in low noise conditions, but suffer severe performance degradation already at moderate noise levels. Classification in the acoustic waveform domain, on the other hand, is less accurate in low noise but exhibits a more robust behavior in high noise conditions. A combination of the cepstral and acoustic waveform representations achieves better classification performance than either of the individual representations over the entire range of noise levels tested, down to -18 dB SNR.
Information Theory and Applications | 2008
Jibran Yousafzai; Matthew Ager; Zoran Cvetkovic; Peter Sollich
Robustness of classification of isolated phoneme segments using discriminative and generative classifiers is investigated for the acoustic waveform and PLP speech representations. The two approaches used are support vector machines (SVMs) and mixtures of probabilistic PCA (MPPCA). While recognition in the PLP domain attains superb accuracy on clean data, it is significantly affected by mismatch between training and test noise levels. Classification in the high-dimensional acoustic waveform domain, on the other hand, is more robust in the presence of additive white Gaussian noise. We also show some results on the effects of custom-designed kernel functions for SVM classification in the acoustic waveform domain.
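The noise condition named in this abstract, additive white Gaussian noise at a controlled level, can be sketched as follows. This is a minimal illustration and not the authors' experimental code: the function names, the fixed seed, and the synthetic tone standing in for a phoneme segment are all assumptions made for the example.

```python
import math
import random


def mean_power(samples):
    """Average power (mean squared value) of a sequence of samples."""
    return sum(v * v for v in samples) / len(samples)


def add_white_gaussian_noise(signal, snr_db, seed=0):
    """Return (noisy_signal, noise), with the noise scaled to hit snr_db exactly."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    # SNR in dB is 10*log10(P_signal / P_noise), so solve for the noise power.
    target_noise_power = mean_power(signal) / (10.0 ** (snr_db / 10.0))
    scale = math.sqrt(target_noise_power / mean_power(noise))
    noise = [scale * n for n in noise]
    noisy = [s + n for s, n in zip(signal, noise)]
    return noisy, noise


def snr_in_db(signal, noise):
    """Empirical signal-to-noise ratio in decibels."""
    return 10.0 * math.log10(mean_power(signal) / mean_power(noise))


# Hypothetical stand-in for a phoneme segment: a 200 Hz tone sampled at 16 kHz.
segment = [math.sin(2.0 * math.pi * 200.0 * t / 16000.0) for t in range(1024)]
noisy_segment, noise = add_white_gaussian_noise(segment, snr_db=0.0)
```

Scaling the sampled noise by its empirical power, rather than by its nominal variance, makes each test segment land on the requested SNR exactly, which keeps train/test mismatch experiments comparable across segments.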
Global Engineering Education Conference | 2015
Jibran Yousafzai; Issam Damaj; Mohammed El Abd
A capstone design project is an extensive piece of work that requires creative activity and thinking. It provides a unique opportunity for students to demonstrate their abilities, skills, and experiences that are attained throughout a bachelor of engineering program. The learning outcomes of capstone projects mostly map to all student outcomes at the program level. This paper presents a unified assessment framework for capstone design courses which allows for sound evaluations of student performance and project qualities in addition to assessing student outcomes. The developed framework comprises criteria, indicators, extensive analytic rubrics, and a summative statistical formulation. The presented course and framework are supported by the results, analysis, and evaluation of a pilot study.
Spoken Language Technology Workshop | 2010
Jibran Yousafzai; Zoran Cvetkovic; Peter Sollich
A subband acoustic waveform front-end for robust speech recognition using support vector machines (SVMs) is developed. The primary issues of kernel design for subband components of acoustic waveforms and combination of the individual subband classifiers using stacked generalization are addressed. Experiments performed on the TIMIT phoneme classification task demonstrate the benefits of classification in frequency subbands: the subband classifier outperforms the cepstral classifiers in the presence of noise for signal-to-noise ratio (SNR) below 12 dB.
International Symposium on Information Theory | 2010
Jibran Yousafzai; Zoran Cvetkovic; Peter Sollich
In this paper, we investigate the robustness of phoneme classification to additive noise with hybrid features using support vector machines (SVMs). In particular, the cepstral features are combined with short term energy features of acoustic waveform segments to form a hybrid representation. The energy features are then taken into account separately in the SVM kernel, and a simple subtraction method allows them to be adapted effectively in noise. This hybrid representation contributes significantly to the robustness of phoneme classification and narrows the performance gap to the ideal baseline of classifiers trained under matched noise conditions.
International Conference on Digital Signal Processing | 2013
Jibran Yousafzai; Zoran Cvetkovic; Peter Sollich
We consider the effects of incorporating prior knowledge of features which correlate with phoneme identity as well as perceptual invariances into the design of SVM kernels for phoneme classification in high-dimensional spaces of acoustic waveforms of speech. To this end we explore products and linear combinations of polynomial and radial basis function kernels to design composite kernels which are invariant to waveform sign and time shift, and capture the dynamics of energy evolution in the time-frequency plane. Experiments show marked improvements in phoneme classification as a result of this custom kernel design. This demonstrates that even in high-dimensional feature spaces, careful kernel design based on prior knowledge of the problem domain can have significant payback.
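One of the invariances mentioned in this abstract, invariance to waveform sign, can be illustrated by averaging a base kernel over a sign flip of one argument. The sketch below is an assumption-laden illustration rather than the kernel from the paper: the polynomial base kernel, its degree and offset, and the function names are hypothetical choices, and time-shift invariance is not covered.

```python
import math


def poly_kernel(x, y, degree=3, offset=1.0):
    """Polynomial kernel (offset + <x, y>) ** degree."""
    dot = sum(a * b for a, b in zip(x, y))
    return (offset + dot) ** degree


def sign_invariant_kernel(x, y, base=poly_kernel):
    """Average the base kernel over y and -y.

    Because the base kernel depends on x and y only through their inner
    product, the result is unchanged when either argument is negated.
    """
    neg_y = [-v for v in y]
    return 0.5 * (base(x, y) + base(x, neg_y))


# Two short hypothetical waveform segments.
x = [1.0, 2.0]
y = [0.5, -1.0]
k_xy = sign_invariant_kernel(x, y)
k_flipped = sign_invariant_kernel([-v for v in x], y)
```

For the cubic base kernel used here, the average keeps only the even powers of the inner product, so `k_xy` and `k_flipped` agree. A kernel of this form could be passed to an SVM implementation that accepts custom or precomputed kernels.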
Global Engineering Education Conference | 2016
Issam Damaj; Jibran Yousafzai
Accurate and simple assessment frameworks are essential in technical higher education. Although accurate results in most cases demand complicated setups, good compromises can lead to the desired assessment with simplicity. In this paper, we propose a unified framework for the assessment of student outcomes based on senior design experiences of undergraduate computer engineering students. Senior design experiences provide unique opportunities for students to demonstrate their abilities, skills, and experiences that are attained throughout a bachelor of engineering program. The proposed framework is built upon capstone design projects and senior design courses. The learning outcomes of senior design courses can be carefully designed to map to all student outcomes. Accordingly, senior design courses can lead to accurate assessments at the program level and within a simple setup of senior courses. The proposed framework allows for sound evaluations of student performance and project qualities. The developed framework comprises criteria, indicators, extensive analytic rubrics, and a summative statistical formulation. The framework is supported by evaluative and comparative analyses of course and student outcome assessments within a pilot study.
IEEE | 2012
Jibran Yousafzai; Matthew Ager; Zoran Cvetkovic; Peter Sollich
Automatic speech recognition (ASR) systems are yet to achieve the level of robustness inherent to speech recognition by the human auditory system. The primary goal of this paper is to argue that exploiting the redundancy in speech signals could be the key to solving the problem of the lack of robustness. This view is supported by our recent results on phoneme classification and recognition in the presence of noise which are surveyed in this paper.
2012 XIII International Symposium on Problems of Redundancy in Information and Control Systems | 2012
Jibran Yousafzai; Zoran Cvetkovic; Matthew Ager; Peter Sollich
Automatic speech recognition (ASR) systems are yet to achieve the level of robustness inherent to speech recognition by the human auditory system. The primary goal of this paper is to argue that exploiting the redundancy in speech signals could be the key to solving the problem of the lack of robustness. This view is supported by our recent results on phoneme classification and recognition in the presence of noise which are surveyed in this paper.
International Conference on Acoustics, Speech, and Signal Processing | 2007
Jibran Yousafzai; Zoran Cvetkovic
The challenge of multichannel equalization for audio applications lies in the physical properties of the underlying multi-input/multi-output (MIMO) linear time-invariant systems which are generally non-minimum phase and exhibit extremely long impulse responses, thereby imposing a considerable computational burden on the equalization task particularly when iterative solutions are sought. In this paper we propose a computationally efficient non-iterative multi-channel equalization algorithm. The proposed algorithm is based on the fast Fourier transform (FFT) and allows for faster and considerably more accurate inversion of MIMO systems compared to traditional deconvolution algorithms and adaptive solutions. We address the accuracy and limitations of the proposed algorithm and present simulation results illustrating its performance.
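The general idea of non-iterative, FFT-based inversion can be sketched for a single channel under a circular-convolution assumption. This is not the MIMO algorithm proposed in the paper: the radix-2 FFT, the function names, and the three-tap non-minimum-phase example filter are all assumptions chosen to keep the example self-contained, and a multichannel version would invert a matrix per frequency bin instead of a scalar.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def ifft(spectrum):
    """Inverse FFT with 1/n normalization."""
    n = len(spectrum)
    return [v / n for v in fft(spectrum, inverse=True)]


def circular_inverse_filter(h, n):
    """Length-n filter g such that h (zero-padded) circularly convolved with g is a unit impulse.

    Assumes the n-point spectrum of h has no zero bins.
    """
    padded = list(h) + [0.0] * (n - len(h))
    spectrum = fft(padded)
    return [v.real for v in ifft([1.0 / hk for hk in spectrum])]


def circular_convolve(a, b):
    """Circular convolution of two equal-length real sequences via the FFT."""
    product = [p * q for p, q in zip(fft(a), fft(b))]
    return [v.real for v in ifft(product)]


# Hypothetical non-minimum-phase channel: zeros at z = 2 and z = 0.5.
h = [1.0, -2.5, 1.0]
n = 64
g = circular_inverse_filter(h, n)
equalized = circular_convolve(h + [0.0] * (n - len(h)), g)
```

Here `equalized` is numerically a unit impulse. With real room responses the convolution is linear rather than circular, so the transform length and any modeling delay would need to be chosen with care; those choices are outside what this sketch shows.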