
Publication


Featured research published by Hamid Reza Sharifzadeh.


IEEE Transactions on Biomedical Engineering | 2010

Reconstruction of Normal Sounding Speech for Laryngectomy Patients Through a Modified CELP Codec

Hamid Reza Sharifzadeh; Ian Vince McLoughlin; Farzaneh Ahmadi

Whispered speech can be useful for quiet and private communication, and is the primary means of unaided spoken communication for many people experiencing voice-box deficiencies. Patients who have undergone partial or full laryngectomy are typically unable to speak anything more than hoarse whispers without the aid of prostheses or specialized speaking techniques. Each of the current prostheses and rehabilitative methods for post-laryngectomized patients (primarily oesophageal speech, tracheo-esophageal puncture, and electrolarynx) has particular disadvantages, prompting new work on nonsurgical, noninvasive alternative solutions. One such solution, described in this paper, combines whisper signal analysis with direct formant insertion and speech modification located outside the vocal tract. This approach allows laryngectomy patients to regain their ability to speak with a more natural voice than alternative methods, by whispering into an external prosthesis, which then recreates and outputs natural-sounding speech. It relies on the observation that while the pitch-generation mechanism of laryngectomy patients is damaged or unusable, the remaining components of the speech production apparatus may be largely unaffected. This paper presents analysis and reconstruction methods designed for the prosthesis, and demonstrates their ability to obtain natural-sounding speech from the whisper-speech signal using an external analysis-by-synthesis processing framework.


Asia Pacific Conference on Circuits and Systems | 2008

Analysis-by-synthesis method for whisper-speech reconstruction

Farzaneh Ahmadi; Ian Vince McLoughlin; Hamid Reza Sharifzadeh

This paper discusses a method for the real-time conversion of whispers to normal phonated speech through a code excited linear prediction (CELP) analysis-by-synthesis codec. This approach uses a template of a speaker's normal phonated speech for extraction of excitation parameters such as pitch and gain, and then injects these estimated excitations into the whispered signal to synthesize normal-sounding speech through the CELP codec. Furthermore, since restoring pitch to whispered speech requires some considerations of quality and accuracy, spectral enhancements are required in terms of formant shifting (LSP modification) and pitch injection based on a voiced/unvoiced decision. Spectral shifting is accomplished through line-spectral pair adjustment. Implementing such methods by using the popular CELP codec allows integration of the technique with any modern speech applications and devices. Subjective testing results are presented to determine the effectiveness of the technique.
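
The pitch injection step described in this abstract can be pictured with a minimal sketch. This is not the authors' codec: the function name `frame_excitation`, the 8 kHz sampling rate, and the use of a plain impulse train are all assumptions made for illustration, and pitch phase continuity across frames is ignored.

```python
def frame_excitation(residual, voiced, f0_hz, gain, fs_hz=8000):
    """Return an excitation signal for one frame (hypothetical helper).

    Unvoiced frames keep the noise-like whisper residual unchanged.
    Voiced frames are replaced by an impulse train at the estimated
    pitch period, scaled by the estimated gain.
    """
    if not voiced:
        return list(residual)
    # Pitch period in samples, never shorter than one sample.
    period = max(1, round(fs_hz / f0_hz))
    return [gain if n % period == 0 else 0.0 for n in range(len(residual))]


if __name__ == "__main__":
    whisper_residual = [0.1, -0.2, 0.05, 0.0, 0.1, -0.1, 0.2, 0.0]
    print(frame_excitation(whisper_residual, True, f0_hz=2000, gain=1.0))
    print(frame_excitation(whisper_residual, False, f0_hz=2000, gain=1.0))
```

In a real CELP decoder the excitation would then drive the linear prediction synthesis filter; that stage is left out here.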


ACM Transactions on Accessible Computing | 2015

Reconstruction of Phonated Speech from Whispers Using Formant-Derived Plausible Pitch Modulation

Ian Vince McLoughlin; Hamid Reza Sharifzadeh; Su Lim Tan; Jingjie Li; Yan Song

Whispering is a natural, unphonated, secondary aspect of speech communications for most people. However, it is the primary mechanism of communications for some speakers who have impaired voice production mechanisms, such as partial laryngectomees, as well as for those prescribed voice rest, which often follows surgery or damage to the larynx. Unlike most people, who choose when to whisper and when not to, these speakers may have little choice but to rely on whispers for much of their daily vocal interaction. Even though most speakers will whisper at times, and some speakers can only whisper, the majority of today’s computational speech technology systems assume or require phonated speech. This article considers conversion of whispers into natural-sounding phonated speech as a noninvasive prosthetic aid for people with voice impairments who can only whisper. As a by-product, the technique is also useful for unimpaired speakers who choose to whisper. Speech reconstruction systems can be classified into those requiring training and those that do not. Among the latter, a recent parametric reconstruction framework is explored and then enhanced through a refined estimation of plausible pitch from weighted formant differences. The improved reconstruction framework, with proposed formant-derived artificial pitch modulation, is validated through subjective and objective comparison tests alongside state-of-the-art alternatives.


Archive | 2009

Regeneration of speech in voice-loss patients

Hamid Reza Sharifzadeh; Ian Vince McLoughlin; Farzaneh Ahmadi

This paper considers regeneration of natural sounding speech from whisper-speech, produced by patients with vocal tract lesions affecting the glottis. Such reconstruction is important for both total and partial laryngectomy patients to improve on the monotonous robotized sound typical of electrolarynx devices.


Archive | 2010

Speech Rehabilitation Methods for Laryngectomised Patients

Hamid Reza Sharifzadeh; Ian Vince McLoughlin; Farzaneh Ahmadi

Rehabilitation of the ability to speak in a natural sounding voice, for patients who suffer larynx and voice box deficiencies, has long been a dream for both patients and researchers working in this field. Removal of, or damage to, the voice box in a surgical operation such as laryngectomy affects the pitch generation mechanism of the human voice production system. Post-laryngectomised patients thus exhibit hoarse, whisper-like and sometimes less intelligible speech – it is obviously different to fully phonated speech, and may lack many of the distinctive characteristics of the patient's normal voice. However, these patients often retain the ability to whisper in a similar way to normal speakers. This chapter firstly discusses how the laryngectomy affects speech before briefly reviewing the three common methods of speech rehabilitation in such patients. It then presents, as a fourth method, an engineering approach to providing laryngectomy patients with the capacity to speak with a more natural sounding voice. As a side effect, this allows them to conveniently use a mobile phone for communications. The approach is non-invasive and uses only auditory information, performing analysis, formant insertion, spectral enhancements and formant smoothing within the reconstruction process. In effect, natural sounding speech is obtained from spoken whispers, without recourse to surgery. The method builds upon our previously published work on using an analysis-by-synthesis approach for voice reconstruction.


International Conference on Signal Processing | 2007

Speech recognition engine adaptions for smart home dialogues

Ian Vince McLoughlin; Hamid Reza Sharifzadeh

This paper considers the needs of speech recognition-based dialogues for smart homes, and proposes a structure to allow effective speech recognition in such circumstances. A smart home system based around the Freevo home theatre platform, and the Sphinx2 speech recognition engine has been designed to implement the features and grammar optimization described. In addition, the particular requirements of complexity and size minimisation for embedded systems are discussed. In practical terms we propose a method of continuously variable vocabulary size to maintain required speech recognition accuracy in a smart home context.


International Conference on Pattern Recognition | 2014

Vein Pattern Visualization through Multiple Mapping Models and Local Parameter Estimation for Forensic Investigation

Hamid Reza Sharifzadeh; Hengyi Zhang; Adams Wai-Kin Kong

Forensic investigation methods based on some human traits, including fingerprint, face, and palm print, have been developed significantly, but some major aspects of particular crimes such as child pornography still lack notable research efforts. Unlike common forensic identification methods, techniques for identifying criminals in child pornographic images should be developed based on partial non-facial skin observable in the images because criminals always hide their faces. A few recently published methods have shown the potential of vein patterns visualized from color images as a criminal and victim identification tool. However, these methods have two weaknesses: 1) they use a single model to visualize vein patterns hidden in color images, which neglects the diversity of skin properties and 2) even though their parameters are determined automatically by an optimization, they do not adapt to fit local image characteristics. To address these weaknesses, this paper proposes an algorithm composed of a bank of mapping models which transform color images to near infrared (NIR) images for visualizing vein patterns and a local parameter estimation scheme for handling different image characteristics in different regions. Imbalanced data regression is also used to systematically construct the model bank. The proposed algorithm is examined and compared with the previous methods on a database of 920 thigh images from 230 subjects. It outperforms the previous methods.


International Symposium on Chinese Spoken Language Processing | 2010

Toward a comprehensive vowel space for whispered speech

Hamid Reza Sharifzadeh; Ian Vince McLoughlin; Martin J. Russell

Whispered speech, as a means of communication, has received prior research efforts on different aspects including some rough estimations/studies on the main vowels /I,ɛ,æ,Λ,℧/. Despite these efforts, it still lacks a classic vowel space determination similar to the vowel acoustic measurements for normal speech which have played a central role in the development and testing of speech recognition and processing theories over the past few decades. The purpose of this study is to redress the shortfall through a preliminary vowel formant space for whispered speech, while comparing the results with corresponding phonated samples. Since the study was conducted using speakers from Birmingham, the analysis also briefly considers the possible effect of British West Midlands (WM) accent in comparison with Standard English (RP) accent. Thus, the paper presents the analysis of formant data showing differences between normal and whispered speech while also considering accentual effect on whispered speech.


International Conference on Telecommunications | 2010

Spectral Enhancement of Whispered Speech Based on Probability Mass Function

Hamid Reza Sharifzadeh; Ian Vince McLoughlin; Farzaneh Ahmadi

Whispered speech can be effectively used for quiet and private communications over mobile phones and is also the communication means for ENT patients under a regime of voice rest. The reconstruction of natural sounding speech from such whispers can be useful for several types of application across different scientific fields ranging from communications to biomedical engineering. Despite the useful applications for such a technology, the reconstruction of natural speech from whispers has received relatively little research effort to date. This paper presents novel methods for spectral enhancement and formant smoothing with the aim of attaining more natural sounding speech within the reconstruction process. The proposed approach uses a probability mass-density function to identify a reliable formant trajectory through whispers and apply vocal modifications accordingly. Subjective evaluation experiments were performed, and are reported, to assess the performance of the techniques. A method for the near real-time conversion of whispers to normal phonated speech through a modified CELP codec has been discussed in our previously published work, upon which the formant modification approach proposed in this paper builds.
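
As a rough illustration of how a probability mass function can pick out a reliable formant value from noisy per-frame estimates, the sketch below bins the estimates and takes the most probable bin. It is a simplified reading of the idea, not the method from the paper: the function names, the 50 Hz bin width, and the choice of the bin centre as the output are assumptions.

```python
from collections import Counter


def formant_pmf(estimates_hz, bin_hz=50):
    """Empirical PMF over frequency bins, keyed by each bin's lower edge in Hz."""
    bins = [int(f // bin_hz) for f in estimates_hz]
    counts = Counter(bins)
    total = len(bins)
    return {b * bin_hz: c / total for b, c in counts.items()}


def most_probable_formant(estimates_hz, bin_hz=50):
    """Centre frequency of the most probable bin (hypothetical smoothing rule)."""
    pmf = formant_pmf(estimates_hz, bin_hz)
    low_edge = max(pmf, key=pmf.get)
    return low_edge + bin_hz / 2


if __name__ == "__main__":
    # Five frames of first-formant estimates; 1500 Hz is a tracking outlier.
    frames = [700, 710, 720, 1500, 705]
    print(formant_pmf(frames))
    print(most_probable_formant(frames))
```

Applying this over a sliding window of frames would give a smoothed trajectory in which isolated outliers are ignored, at the cost of resolution limited by the bin width.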


IEEE Transactions on Audio, Speech, and Language Processing | 2016

A statistical inverse problem approach to online secondary path modeling in active noise control

Iman Tabatabaei Ardekani; Jari P. Kaipio; Alireza Nasiri; Hamid Reza Sharifzadeh; Waleed H. Abdulla

This paper recasts the problem of online secondary path modeling in the form of a statistical inverse problem. A statistical and, in particular, a Bayesian approach towards secondary path modeling is developed and the computational issues that emerge from this approach are discussed. All signals and parameters are modeled as random variables and the degree of information concerning them is coded in their probability density functions. An abstract solution is formulated in the form of a probability density function for the secondary path model. For extracting point estimates, common statistical estimation methods are investigated. It is shown that maximum likelihood estimation is not stable; however, the Bayesian method of maximum a posteriori estimation gives a reliable solution. An adaptive algorithm is then developed to compute this solution in a computationally efficient manner. This algorithm has three advantages compared to the traditional secondary path modeling algorithms. First, it does not cause any interference with the main active noise control algorithm. Second, it does not require any additive noise to be injected into the secondary path. Third, it does not require any off-line initiation. The convergence of the proposed algorithm is analyzed theoretically. The validity of the theoretical results is investigated by using computer simulation. Finally, successful integration of the proposed algorithm into a real-time ANC system is reported.
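
The contrast between maximum likelihood and maximum a posteriori estimation can be seen in a toy scalar case. The sketch below is not the adaptive algorithm from the paper: it assumes a single unknown gain s in y = s·x + e, Gaussian noise with variance `noise_var`, and a zero-mean Gaussian prior on s with variance `prior_var`; all of these names and modeling choices are illustrative assumptions.

```python
def ml_gain(x, y):
    """Maximum likelihood (least squares) estimate of s in y = s*x + e.

    Divides by the excitation energy sum(x^2), so it is undefined when
    the input carries no energy and very noisy when the energy is small.
    """
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    return sxy / sxx


def map_gain(x, y, noise_var=1.0, prior_var=1.0):
    """MAP estimate of s under a zero-mean Gaussian prior (assumed model).

    The ratio noise_var / prior_var acts as a regulariser that keeps the
    denominator away from zero, which is what stabilises the estimate.
    """
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    return sxy / (sxx + noise_var / prior_var)


if __name__ == "__main__":
    x, y = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
    print(ml_gain(x, y), map_gain(x, y))
    # With no excitation the MAP estimate falls back to the prior mean,
    # whereas ml_gain would raise ZeroDivisionError.
    print(map_gain([0.0, 0.0], [0.1, -0.1]))
```

With a well-excited input the two estimates are close; as the excitation energy shrinks, the MAP estimate falls back towards the prior mean instead of diverging.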

Collaboration


Dive into Hamid Reza Sharifzadeh's collaborations.

Top Co-Authors

Farzaneh Ahmadi

Nanyang Technological University

Hossein Sarrafzadeh

Unitec Institute of Technology

Jingjie Li

University of Science and Technology of China
