Publication


Featured research published by Hanjun Liu.


IEEE Transactions on Biomedical Engineering | 2006

Enhancement of electrolarynx speech based on auditory masking

Hanjun Liu; Qin Zhao; Mingxi Wan; Supin Wang

Electrolarynx (EL) speech provides a valuable means of verbal communication for laryngectomees. Yet EL speech tends to be less intelligible because of background noise. This paper addresses the issue of EL speech enhancement. The proposed approach takes into account the frequency-domain masking properties of the human auditory system in a subtractive-type enhancement process. Subtractive-type algorithms can efficiently reduce the radiated noise of EL speech, but because they use fixed subtraction parameters they cannot reduce the additive noise from the environment. Considering the particular characteristics of EL speech, a new computationally efficient algorithm based on the perceptual weighting technique is developed to adapt the subtraction parameters. This leads to a significant reduction of the unnatural structure of the residual noise. Acoustic and perceptual experiments confirm that the enhanced EL speech is more pleasant to human listeners and that the proposed algorithm outperforms classical subtractive-type algorithms.


Folia Phoniatrica Et Logopaedica | 2007

Effects of Place of Articulation and Aspiration on Voice Onset Time in Mandarin Esophageal Speech

Hanjun Liu; Manwa L. Ng; Mingxi Wan; Supin Wang; Yi Zhang

The ability of Mandarin esophageal speakers to distinguish between aspirated and unaspirated stops, and between different places of articulation of stops, was examined. Aspirated and unaspirated voiceless stops produced by normal laryngeal (NL) and standard esophageal (SE) speakers were studied. Voice onset time (VOT) values of the five different stops (/pʰ, tʰ, kʰ, p, t/) of Mandarin followed by the vowel /a/ produced by NL and SE speakers were compared. Results from the perceptual experiment indicated that, while voiceless unaspirated stops produced by SE speakers were not signaled with a high level of accuracy, voiceless aspirated stops were perceived correctly by the listeners. Acoustic analysis showed that SE speakers consistently produced shorter VOT than NL speakers. Velar stops were associated with significantly longer VOT values than bilabial and alveolar stops in NL speakers. It was also found that, with the use of the pharyngoesophageal (PE) segment, SE speakers were still able to use VOT to distinguish between aspirated and unaspirated stops, but they were unable to distinguish between bilabial, alveolar, and velar places of articulation.


Medical & Biological Engineering & Computing | 2003

Enhancement of electrolarynx speech using adaptive noise cancelling based on independent component analysis

Haijun Niu; Mingxi Wan; Supin Wang; Hanjun Liu

The electrolarynx provides a valuable means of verbal communication for people who cannot use their natural voice-production mechanism, but the technology has changed very little since it was introduced in the 1950s. The presence of background noise degrades the resulting speech. In this study, background noise was reduced by a new method, independent component analysis-based adaptive noise cancelling, which removes noise components of the primary input signal on the basis of statistical independence by incorporating both second-order and higher-order statistics. The method shows better performance than the conventional least mean square algorithm. Acoustic analysis of the denoised electrolarynx speech revealed a significant reduction in the amount of background noise. Results from the perceptual evaluations indicated that the new filtering technique produced a noticeable improvement in the acceptability of the electrolarynx speech in a quiet environment (from 1.75 to 2.49, arbitrary units) and in a noisy environment (from 0.59 to 1.82). In general, there was no significant improvement or degradation in intelligibility in the quiet environment (from 52.7 to 53.3). However, the processing did improve the intelligibility in a babble-noise environment (from 24.9 to 40.6). The improvement in acceptability and intelligibility may increase the communication ability of the user in daily situations.
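
The abstract above names the conventional least mean square (LMS) algorithm as its point of comparison. A minimal sketch of a two-input LMS adaptive noise canceller may make that baseline concrete. It is not the ICA-based method of the paper, and the function name, tap count, and step size are illustrative assumptions.

```python
import numpy as np

def lms_noise_canceller(primary, reference, n_taps=16, mu=0.005):
    """Two-input LMS noise canceller (conventional baseline, not the ICA method).

    primary:   speech plus noise, as picked up by the main microphone
    reference: noise-only signal correlated with the noise in `primary`
    n_taps, mu: filter length and step size (assumed values for illustration)
    Returns the error signal, which serves as the estimate of the clean speech.
    """
    primary = np.asarray(primary, dtype=float)
    reference = np.asarray(reference, dtype=float)
    weights = np.zeros(n_taps)
    output = np.zeros(len(primary))
    for n in range(n_taps - 1, len(primary)):
        # Most recent n_taps reference samples, newest first.
        x = reference[n - n_taps + 1 : n + 1][::-1]
        noise_estimate = weights @ x
        error = primary[n] - noise_estimate
        # LMS update: step along the instantaneous gradient of the squared error.
        weights += 2.0 * mu * error * x
        output[n] = error
    return output
```

The filter relies only on second-order statistics (the correlation between the reference and the primary input), which is the limitation the ICA-based approach is described as addressing by also using higher-order statistics.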


Journal of the Acoustical Society of America | 2005

Acoustic characteristics of Mandarin esophageal speech

Hanjun Liu; Mingxi Wan; Supin Wang; Xiaodong Wang; Chunmei Lu

The present study investigated the acoustic characteristics of Mandarin laryngeal and esophageal speech. Eight normal laryngeal and seven esophageal speakers participated in the acoustic experiments. Results from acoustic analyses of the syllables /ma/ and /ba/ indicated that F0, intensity, and signal-to-noise ratio of laryngeal speech were significantly higher than those of esophageal speech. However, opposite results were found for vowel duration, jitter, and shimmer. In reading, mean F0, intensity, and words per minute were greater, and the number of pauses was smaller, in laryngeal speech than in esophageal speech. Laryngeal and esophageal speakers showed similar patterns of F0 contours and vowel duration as a function of tone. Long-term spectral analysis indicated that esophageal speech was associated with higher first and second formant frequencies than normal laryngeal speech.


Journal of the Acoustical Society of America | 2006

Application of spectral subtraction method on enhancement of electrolarynx speech

Hanjun Liu; Qin Zhao; Mingxi Wan; Supin Wang

Although the electrolarynx (EL) serves as an important method of phonation for laryngectomees, the resulting speech is of poor intelligibility because of a steady background noise caused by the instrument, and it is even worse in the presence of additive noise. This paper investigates the problem of EL speech enhancement by taking into account the frequency-domain masking properties of the human auditory system. One approach incorporates an auditory masking threshold (AMT) for parametric adaptation in a subtractive-type enhancement process. The other is the supplementary AMT (SAMT) algorithm, which applies a cross-correlation spectral subtraction (CCSS) approach as a post-processing scheme to further enhance EL speech already processed with the AMT method. The performance of these two algorithms was compared with that of the power spectral subtraction (PSS) algorithm. The best EL speech enhancement was obtained with the SAMT algorithm, followed by the AMT algorithm and then the PSS algorithm. Acoustic and perceptual analyses indicated that, compared with the PSS algorithm, the AMT and SAMT algorithms achieved better noise reduction and that the enhanced EL speech was more pleasant to human listeners.
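
For orientation, the power spectral subtraction (PSS) baseline named in the abstract can be sketched as follows. This is a generic textbook form with fixed subtraction parameters, not the AMT or SAMT algorithm. The function name, frame length, and the assumption that the first few frames contain noise only are illustrative choices rather than details from the paper.

```python
import numpy as np

def power_spectral_subtraction(noisy, noise_frames=5, frame_len=256,
                               alpha=1.0, beta=0.01):
    """Basic power spectral subtraction with 50% overlap-add.

    noise_frames: number of leading frames assumed to contain noise only
    alpha: over-subtraction factor, beta: spectral floor (both fixed here;
           the AMT approach instead adapts such parameters per frame)
    """
    noisy = np.asarray(noisy, dtype=float)
    hop = frame_len // 2
    # Periodic Hann window: shifted copies sum to 1 at 50% overlap.
    window = np.hanning(frame_len + 1)[:-1]
    n_frames = 1 + (len(noisy) - frame_len) // hop
    frames = np.stack([noisy[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    spectra = np.fft.rfft(frames, axis=1)
    power = np.abs(spectra) ** 2
    # Noise power estimate from the leading (assumed noise-only) frames.
    noise_power = power[:noise_frames].mean(axis=0)
    # Subtract the noise estimate and floor the result to avoid negative power.
    clean_power = np.maximum(power - alpha * noise_power, beta * power)
    # Recombine the enhanced magnitude with the noisy phase.
    enhanced = np.sqrt(clean_power) * np.exp(1j * np.angle(spectra))
    frames_out = np.fft.irfft(enhanced, n=frame_len, axis=1)
    output = np.zeros(len(noisy))
    for i in range(n_frames):
        output[i * hop : i * hop + frame_len] += frames_out[i]
    return output
```

Because `alpha` and `beta` stay constant across frames and frequencies, the residual noise tends to have the unnatural, tonal structure that masking-based parameter adaptation is intended to reduce.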


Folia Phoniatrica Et Logopaedica | 2006

Tonal Perceptions in Normal Laryngeal, Esophageal, and Electrolaryngeal Speech of Mandarin

Hanjun Liu; Mingxi Wan; Manwa L. Ng; Supin Wang; Chunmei Lu

The present study investigated whether alaryngeal speakers of Mandarin Chinese could differentially produce the four tone levels of Mandarin (high-level, mid-rising, falling-rising, and high-falling) as perceived by native listeners. Syllables /ma/ and /ba/ produced at four different tone levels by 8 normal laryngeal (NL), 7 standard esophageal (SE), and 8 electrolaryngeal (EL) speakers of Mandarin were perceived by 8 naïve listeners. Results from the listening experiments showed a higher percent correct identification of tone for NL speech than for SE and EL speech. Perceptual data showed different patterns of tone confusion associated with the three types of speech. SE and EL speakers were not able to produce the four tone levels as accurately as NL speakers. NL speakers achieved a near-perfect level of accuracy in signaling tonal contrasts. SE speakers produced the falling-rising and high-falling tones more accurately than the high-level and mid-rising tones, but tonal confusions existed between the mid-rising and falling-rising tones, and between the high-level and high-falling tones. In EL phonation, the high-level tone was produced more accurately than the other tones, which were often misidentified as a high-level tone.


Auris Nasus Larynx | 2009

Long-term average spectral characteristics of Cantonese alaryngeal speech

Manwa L. Ng; Hanjun Liu; Qin Zhao; Paul K.Y. Lam

OBJECTIVE: In Hong Kong, esophageal (SE), tracheoesophageal (TE), electrolaryngeal (EL), and pneumatic artificial laryngeal (PA) speech are commonly used by laryngectomees as a means to regain verbal communication after total laryngectomy. While SE and TE speech has been studied to some extent, little is known regarding the EL and PA sound quality. The present study examined the sound quality associated with SE, TE, EL, and PA speech, and compared it with that associated with laryngeal (NL) speech by using long-term average speech spectra (LTAS). METHODS: Continuous speech samples of reading a 136-word passage were obtained from NL, SE, TE, EL, and PA speakers of Cantonese. The alaryngeal speakers were all superior speakers selected from the New Voice Club of Hong Kong, which is a self-help organization for laryngectomees in Hong Kong. TE speakers were fitted with a Provox valve, and EL speakers used a Servox-type electrolarynx. Speech samples were digitized at 20 kHz and 16 bits/sample by using Praat, and LTAS contours were developed from them. First spectral peak (FSP), mean spectral energy (MSE), and spectral tilt (ST) derived from the LTAS contours associated with the different speaker groups were compared. RESULTS: Data revealed that all speakers generally exhibited similar LTAS contours. However, PA speakers exhibited the lowest average FSP value and the greatest average MSE value. NL phonation was associated with a significantly greater ST value than alaryngeal speech of Cantonese. CONCLUSION: The differences in FSP, MSE, and ST values among the speaker groups may be related to the different sound sources used by the laryngectomees and to differences in the way the sound source is coupled with the vocal tract system.
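
As a rough illustration of the kind of analysis described above, an LTAS contour can be obtained by averaging short-time power spectra over a whole recording. The sketch below uses NumPy rather than Praat, and the three derived measures are simple stand-ins: the abstract does not give the exact definitions of FSP, MSE, and ST, so the peak search, the dB averaging, and the 0 to 1 kHz versus 1 to 5 kHz band split are all assumptions, as are the function names.

```python
import numpy as np

def ltas_db(signal, fs, frame_len=1024):
    """Long-term average spectrum in dB: mean power spectrum over all frames."""
    signal = np.asarray(signal, dtype=float)
    hop = frame_len // 2
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)
    # Small constant avoids log(0) for empty bins.
    return freqs, 10.0 * np.log10(power.mean(axis=0) + 1e-12)

def ltas_measures(freqs, ltas):
    """Assumed stand-ins for FSP, MSE, and ST derived from an LTAS contour."""
    # Assumption: take the frequency of the strongest LTAS peak as the FSP.
    first_peak_hz = freqs[np.argmax(ltas)]
    # Assumption: MSE as the mean LTAS level in dB across all bins.
    mean_energy_db = ltas.mean()
    # Assumption: tilt as the energy ratio of 0-1 kHz to 1-5 kHz, in dB.
    linear = 10.0 ** (ltas / 10.0)
    low = linear[freqs < 1000.0].sum()
    high = linear[(freqs >= 1000.0) & (freqs <= 5000.0)].sum()
    tilt_db = 10.0 * np.log10(low / high)
    return first_peak_hz, mean_energy_db, tilt_db
```

With recordings sampled at 20 kHz as in the study, `ltas_db(samples, 20000)` would cover 0 to 10 kHz, and the measures could then be compared across speaker groups.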


Folia Phoniatrica Et Logopaedica | 2005

Features of listeners affecting the perceptions of Mandarin electrolaryngeal speech

Hanjun Liu; Mingxi Wan; Supin Wang

The effect of age and gender of listeners on the perception of Mandarin electrolaryngeal speech was investigated. Sixty male and 40 female listeners were divided into five age groups (20–29, 30–39, 40–49, 50–59, and 60–70 years). All were regarded as naïve listeners because they had no experience with electrolaryngeal speech. They were instructed to score the acceptability of a passage and the intelligibility of isolated words and embedded words. The results revealed no gender effects but significant age effects on the perceptual evaluation. It was more difficult for the 50–59 and 60–70 groups to understand electrolaryngeal speech. The results were also analyzed for tonal and segmental errors, and errors of tone alone were found to occur more often than segmental errors. In addition, a preliminary study of the perception of the four Mandarin tones was presented. Higher percent correct identification was found for the high-level tone than for the other three tones.


Journal of the Acoustical Society of America | 2005

Voice onset time in Mandarin esophageal speech

Manwa L. Ng; Hanjun Liu

As an important perceptual cue for voicing and aspiration of stops, voice onset time (VOT) is mainly determined by the aerodynamic interaction between the intraoral and subglottal regions. However, since the pharyngoesophageal (PE) segment serves as a new vibratory source in esophageal phonation, aerodynamic events are very different from those in laryngeal phonation. VOT associated with esophageal speech of English has been reported previously. However, few studies have reported VOT characteristics of esophageal speech of tone languages. The present study will investigate the possible VOT difference between esophageal and normal laryngeal speakers of Mandarin Chinese. Seven superior esophageal speakers and 7 normal laryngeal speakers will participate in the present investigation. They will be native male speakers of Mandarin Chinese. The participants will produce the syllable /ta/ embedded in a carrier phrase three times at a comfortable loudness level. VOT values will be measured from a time domain waveform. With reference to a ...


Auris Nasus Larynx | 2007

Electrolarynx in voice rehabilitation

Hanjun Liu; Manwa L. Ng

Collaboration


Top co-authors of Hanjun Liu.

Mingxi Wan, Xi'an Jiaotong University

Supin Wang, Xi'an Jiaotong University

Manwa L. Ng, University of Hong Kong

Qin Zhao, Xi'an Jiaotong University

Yi Zhang, Xi'an Jiaotong University