Publication


Featured research published by Yen-Liang Shue.


Journal of the Acoustical Society of America | 2009

VOICESAUCE: A program for voice analysis.

Yen-Liang Shue; Patricia A. Keating; Chad Vicenik

VOICESAUCE is a new application, implemented in MATLAB, which provides automated voice measurements over time from audio recordings. The measures currently computed are F0, H1(*), H2(*), H4(*), H1(*)‐H2(*), H2(*)‐H4(*), H1(*)‐A1, H1(*)‐A2, H1(*)‐A3, energy, Cepstral Peak Prominence, F1–F4, and B1–B4, where (*) indicates that harmonic amplitudes are reported with and without corrections for formant frequencies and bandwidths [Iseli et al. (2006)]. Formant values are calculated using the Snack Sound Toolkit, while F0 is calculated using the STRAIGHT algorithm; harmonic spectra magnitudes are computed pitch‐synchronously. VOICESAUCE takes as input a folder of wav files, and for each input wav file produces a MATLAB file with values every millisecond for all measures. It can operate over the whole input file or over segments delimited by a PRAAT textgrid file. VOICESAUCE then takes these MATLAB outputs, optionally along with electroglottographic measurements obtained separately from PCQUIRERX, and provides con...
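
As a rough illustration of one of the measures listed above, the following Python sketch computes an uncorrected H1‐H2 value (the level of the first harmonic minus the level of the second, in dB) from a synthetic tone. It is not VOICESAUCE code: VOICESAUCE is implemented in MATLAB and computes harmonic magnitudes pitch‐synchronously with optional formant corrections, whereas this sketch assumes a known, constant F0 and a window holding a whole number of periods. The function names, sample rate, and test tone are assumptions made for the example.

```python
import cmath
import math


def harmonic_amplitude(signal, sample_rate, freq_hz):
    """Amplitude of the sinusoidal component at freq_hz (single-bin DFT)."""
    n = len(signal)
    acc = sum(
        x * cmath.exp(-2j * math.pi * freq_hz * i / sample_rate)
        for i, x in enumerate(signal)
    )
    return 2.0 * abs(acc) / n


def h1_minus_h2_db(signal, sample_rate, f0_hz):
    """Uncorrected H1-H2 in dB: level of the 1st harmonic minus the 2nd."""
    h1 = harmonic_amplitude(signal, sample_rate, f0_hz)
    h2 = harmonic_amplitude(signal, sample_rate, 2.0 * f0_hz)
    return 20.0 * math.log10(h1 / h2)


# Synthetic test tone (assumed values): 200 Hz fundamental plus a second
# harmonic at half the amplitude; 1600 samples at 16 kHz = 20 whole periods.
SAMPLE_RATE = 16000
F0_HZ = 200.0
tone = [
    math.sin(2 * math.pi * F0_HZ * i / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 2 * F0_HZ * i / SAMPLE_RATE)
    for i in range(1600)
]
h1h2 = h1_minus_h2_db(tone, SAMPLE_RATE, F0_HZ)
```

Because the second harmonic has half the amplitude of the first, `h1h2` comes out at 20·log10(2), about 6 dB. On real speech, F0 varies over time and the formants shape the harmonic levels, which is why the abstract reports both corrected and uncorrected values.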


Journal of the Acoustical Society of America | 2012

Variability in the relationships among voice quality, harmonic amplitudes, open quotient, and glottal area waveform shape in sustained phonation.

Jody Kreiman; Yen-Liang Shue; Gang Chen; Markus Iseli; Bruce R. Gerratt; Juergen Neubauer; Abeer Alwan

Increases in open quotient are widely assumed to cause changes in the amplitude of the first harmonic relative to the second (H1*-H2*), which in turn correspond to increases in perceived vocal breathiness. Empirical support for these assumptions is rather limited, and reported relationships among these three descriptive levels have been variable. This study examined the empirical relationship among H1*-H2*, the glottal open quotient (OQ), and glottal area waveform skewness, measured synchronously from audio recordings and high-speed video images of the larynges of six phonetically knowledgeable, vocally healthy speakers who varied fundamental frequency and voice qualities quasi-orthogonally. Across speakers and voice qualities, OQ, the asymmetry coefficient, and fundamental frequency accounted for an average of 74% of the variance in H1*-H2*. However, analyses of individual speakers showed large differences in the strategies used to produce the same intended voice qualities. Thus, H1*-H2* can be predicted with good overall accuracy, but its relationship to phonatory characteristics appears to be speaker dependent.
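
The phrase "accounted for an average of 74% of the variance" refers to a coefficient of determination (R²). The sketch below shows how that quantity is computed for a least-squares fit. It is a simplification under stated assumptions: it uses a single predictor, whereas the study used three (OQ, the asymmetry coefficient, and fundamental frequency), and the numbers are made up for illustration and are not data from the study.

```python
def r_squared(x, y):
    """Proportion of variance in y explained by a least-squares line on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual and total sums of squares around the fitted line and the mean.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Made-up illustrative values, NOT measurements from the study.
open_quotient = [0.4, 0.5, 0.6, 0.7, 0.8]
h1h2_db = [2.0, 4.5, 5.5, 9.0, 10.0]
explained = r_squared(open_quotient, h1h2_db)
```

A value of `explained` near 1 means the predictor accounts for most of the variation in H1*-H2*. The abstract's point is that a high pooled value can still hide large differences between individual speakers.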


International Conference on Acoustics, Speech, and Signal Processing | 2006

Age- and Gender-Dependent Analysis of Voice Source Characteristics

Markus Iseli; Yen-Liang Shue; Abeer Alwan

The effects of age, gender, and vocal tract configurations on the glottal excitation signal are still only partially understood. In this paper we examine some of these effects, and show that the voice source parameters, such as fundamental frequency (F0), open quotient (related to H1*-H2*), and spectral tilt (related to H1*-A3*) are not only affected by age and gender but are also intercorrelated (the asterisk denotes correction for the influence of various formants). Recordings of 92 male and female speakers from three age groups (8, 15, 20-39) are analyzed. The main observations are: for low-pitched talkers H1*-H2* (hence, the open quotient) is proportional to F0, while for high-pitched talkers H1*-H2* is proportional to F1 (high to low vowels) for F1 < 700 Hz. The parameter H1*-A3* showed a strong dependence on F2 and F3 for all talkers and age groups: increasing F2 or F3 yielded an increase in H1*-A3*. Spectral tilt was seen to be vowel dependent and for male talkers, spectral tilt changed dramatically with age. A better understanding of the dependencies of voice source parameters on age and gender will help improve voice source parameter estimation and analysis for a variety of speech processing and medical applications.


International Conference on Acoustics, Speech, and Signal Processing | 2008

The role of voice source measures on automatic gender classification

Yen-Liang Shue; Markus Iseli

Differences of physiological properties of the glottis and the vocal tract are partly due to age and/or gender differences. Since these differences are reflected in the speech signal, acoustic measures related to those properties can be helpful for automatic age and gender classification. In this paper, the focus is on the role of acoustic measures related to the voice source in automatic gender classification, implemented using support vector machines (SVMs). Acoustic measures of the vocal tract and the voice source were extracted from 3880 utterances spoken by 205 male and 160 female talkers (aged 8 to 39 years old). Formant frequencies and formant bandwidths were used as vocal tract measures, and open quotient and source spectral tilt correlates were used as voice source measures. Results show that the addition of voice source measures can help improve automatic gender classification results for most age groups.


International Conference on Acoustics, Speech, and Signal Processing | 2010

A new voice source model based on high-speed imaging and its application to voice source estimation

Yen-Liang Shue; Abeer Alwan

There are numerous models of varying complexities which seek to efficiently represent the voice source signal. These models are typically based on data and observations which can come from air-flow masks, electroglottographs, mechanical systems, and the inverse-filtering of speech signals. The first part of this study examines observations from the high-speed imaging of the larynx and proposes a new source model, which is shown to provide a better fit for the observed data than existing models. The proposed source model is then used in an automatic source estimation application, based on methods introduced in an earlier study [1]. Results, on average, show that the proposed model provides a more accurate estimation of the source signal compared with the Liljencrants-Fant model.


Journal of the Acoustical Society of America | 2008

The relationship between open quotient and H1*‐H2*.

Jody Kreiman; Markus Iseli; Juergen Neubauer; Yen-Liang Shue; Bruce R. Gerratt; Abeer Alwan

It is widely assumed that changes in open quotient (OQ) produce corresponding changes in H1*‐H2*, but empirical data supporting this relationship are scant. To provide such data, high‐speed video images and audio signals were simultaneously recorded from six speakers producing the vowel /i/ while varying F0 from high to low and voice quality from pressed to breathy. Across speakers, the observed relationship between OQ and H1*‐H2* was much weaker than generally assumed. Patterns of covariation also differed substantially from speaker to speaker. Estimation of harmonic amplitudes was complicated by difficulties in determining the frequency of F1 when F0 was high and by uncertainties regarding the F1 bandwidth in the presence of a persistent glottal chink. Use of analysis‐by‐synthesis allowed correction of formant values, but bandwidth estimation remains problematic and will be discussed further at the conference. [Work supported by NIH Grant DC01797 and NSF Grant BCS‐0720304.]


Journal of the Acoustical Society of America | 2009

Voice quality variation with fundamental frequency in English and Mandarin.

Patricia A. Keating; Yen-Liang Shue

Previous research has shown that F0 is positively related to H1*‐H2* across male speakers of English [Iseli et al. (2006)] and to H1‐H2 (after inverse filtering) within individual male speakers of Dutch [Swerts and Veldhuis (2001)]. That is, males who have overall higher‐pitched voices generally have overall higher values of H1*‐H2* (cross‐speaker relation), and as an individual male’s F0 goes up, H1‐H2 generally also goes up (within‐speaker relation). The present study investigates both of these relations, cross‐speaker and within‐speaker, for male and female speakers of English and Mandarin, and extends them to a large set of voice quality measures. The speech samples consist of repeated rising and falling tone sweeps, in which speakers began at a self‐selected comfortable pitch, and then swept either up or down in pitch to their highest or lowest comfortable pitch. The beginnings of the sweeps are tested for cross‐speaker relations, while the entire sweeps are tested for within‐speaker relations. VOICESAUCE, a new program for voice analysis, is used to extract F0, energy, cepstral peak prominence, formants and bandwidths, and a variety of harmonic amplitude measures. Many measures are shown to be strongly related to F0. [Work supported by NSF.]


Journal of the Acoustical Society of America | 2005

Analysis of vowel and speaker dependencies of source harmonic magnitudes in consonant‐vowel utterances

Markus Iseli; Yen-Liang Shue; Abeer Alwan

It is assumed that voice quality characteristics are mainly manifested in the glottal excitation signal. As [Holmberg et al., J. Speech Hear. Res. 38, 1212–1223 (1995)] showed, there is a correlation between low‐frequency harmonic magnitudes of the glottal source spectrum and voice quality parameters. In this study, we assess the influence of vowel and speaker differences on the difference between the first and the second harmonic magnitudes, H1−H2. The improved harmonics correction formula introduced in [Iseli et al., Proceedings of ICASSP, Vol. 1 (2004), pp. 669–672] is used to estimate source harmonic magnitudes. H1−H2 is estimated for consonant‐vowel utterances where the vowel is one of the three vowels /a,i,u/ and the consonant is one of the six plosives in American English /b,p,d,t,g,k/. Several repetitions of each of the utterances, spoken by two male and two female talkers, are analyzed. Other measurements, such as fundamental frequency, F0, and energy are estimated pitch synchronously. Results ar...


Journal of the Acoustical Society of America | 2012

On parameterizing glottal area waveforms from high-speed images

Gang Chen; Jody Kreiman; Bruce R. Gerratt; Yen-Liang Shue; Abeer Alwan

Because voice signals result from vocal fold vibration, perceptually-meaningful vibratory measures should quantify those aspects of vibration that correspond to differences in voice quality. In this study, glottal area waveforms were calculated from high-speed images of the vocal folds. Principal component analysis was applied to these waveforms to investigate the factors that vary with voice quality. Results showed that the first two principal components were significantly (p < 0.01) associated with the open quotient and the ratio of alternating-current to direct-current components. However, these conventional source measures, which are based on glottal flow, do not fully characterize observed variations in glottal area pulse shape across different glottal configurations, especially with respect to patterns of glottal closure that may be perceptually important. A source measure, the Source Dynamic Index (SDI), is proposed to characterize glottal area waveform variation for both complete and incomplete gl...


Journal of the Acoustical Society of America | 2007

Age, sex, and vowel dependencies of acoustic measures related to the voice source

Markus Iseli; Yen-Liang Shue; Abeer Alwan

Collaboration


Top Co-Authors

Abeer Alwan (University of California)

Markus Iseli (University of California)

Jody Kreiman (University of California)

Gang Chen (University of California)

Stefanie Shattuck-Hufnagel (Massachusetts Institute of Technology)

Sun-Ah Jun (University of California)