Janet Slifka
Massachusetts Institute of Technology
Publication
Featured research published by Janet Slifka.
Journal of Phonetics | 2001
Helen M. Hanson; Kenneth N. Stevens; Hong-Kwang Jeff Kuo; Marilyn Y. Chen; Janet Slifka
The earliest models of phonation were based on the assumption that the glottis is closed during a part of the vibration cycle, that is, the phonation is modal. Nonmodal phonation, however, commonly occurs not only for disordered voice but also for normal voices, which often exhibit a breathy quality or irregular vibration. In this paper, we review recent work that examines acoustic data and models of nonmodal phonation in both normal and disordered voice. We first describe acoustic models that predict how the glottal source varies from modal phonation to phonation resulting from glottal configurations that are partially abducted, including a posterior glottal opening. These models are applied first to vowels of nondisordered adults, and, later in the paper, to vowels produced by adults with dysarthria. We also present results from a study in which a modified version of the two-mass model is used to resolve a seeming conflict among aerodynamic and acoustic data collected from adult female subjects with vocal-fold nodules. Some discussion of nonmodal phenomena that occur due to prosodic and emotional influences is included. Overall, it appears that current models of modal phonation can be extended to include a range of nonmodal phonation types.
Journal of the Acoustical Society of America | 2005
Janet Slifka
In a landmark‐based model of lexical access [K. N. Stevens, J. Acoust. Soc. Am. 111, 1872–1891 (2002)], the presence of a vowel is marked by a peak in energy in the first formant region. However, when a vowel is followed by a schwa, the schwa frequently appears as a shoulder on the peak associated with the first vowel [W. Howitt, MIT (2000)]; two landmarks are not present. The purpose of this study is to examine duration and F2 movement as possible cues to the presence of a vowel‐schwa sequence for [+high, +front] vowels. This subset of vowels presents at least two challenges to the detection of a vowel‐schwa sequence: (1) duration is expected to contribute to the difference between /i/ and /I/, and (2) an F2 off‐glide toward schwa is expected for /I/ in American English. For 613 tokens from the phonetically labeled TIMIT database, equally distributed between /iə/, /i/, and /I/, a measure of F2 curvature is a stronger cue than duration in classifying the tokens. Using F2 curvature, over 93% of the tokens ...
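The abstract does not define its F2 curvature measure. As a rough illustration only, one plausible stand-in is the quadratic coefficient of a least-squares parabola fitted to the F2 track over normalized time. The function name `f2_curvature`, the assumption of evenly spaced samples, and the choice of a parabola are all assumptions made for this sketch, not details taken from the study.

```python
def f2_curvature(f2_hz):
    """Quadratic coefficient (Hz) of a least-squares parabola fitted to an F2 track.

    Hypothetical measure for illustration: the F2 samples are assumed to be
    evenly spaced in time and are mapped onto normalized time t in [-1, 1].
    A value near zero means the track is close to a straight line; a large
    magnitude means the track bends strongly.
    """
    n = len(f2_hz)
    if n < 3:
        raise ValueError("need at least three F2 samples")

    # Normalized, symmetric time axis, so the sums of t and t**3 vanish.
    t = [2.0 * i / (n - 1) - 1.0 for i in range(n)]

    s2 = sum(x ** 2 for x in t)
    s4 = sum(x ** 4 for x in t)
    sy = sum(f2_hz)
    st2y = sum(x * x * y for x, y in zip(t, f2_hz))

    # Closed-form solution of the normal equations for y = a + b*t + c*t**2
    # on a symmetric time axis; only c (the curvature term) is returned.
    return (n * st2y - s2 * sy) / (n * s4 - s2 * s2)


if __name__ == "__main__":
    straight_glide = [2000.0, 1900.0, 1800.0, 1700.0, 1600.0]
    bent_track = [1700.0, 1925.0, 2000.0, 1925.0, 1700.0]
    print(f2_curvature(straight_glide))
    print(f2_curvature(bent_track))
```

Under these assumptions a straight F2 glide yields a coefficient near zero, while a track that rises and then falls yields a negative one. Any threshold for separating token classes would have to be estimated from labeled data.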
Journal of the Acoustical Society of America | 2008
Janet Slifka
This work is part of an ongoing study to characterize respiratory system involvement during the generation of pauses in connected speech. At pauses, the speaker takes a breath or the speaker does not take a breath. Previously reported results for read speech observed that without‐breath pauses were generated with a sharp movement toward net inspiratory effort followed by a sharp return toward increased net expiratory effort [J. Slifka, In Dynamics of Speech Production and Perception (IOS, 2006), pp. 45‐58]. In the present work, 52 utterances of spontaneous speech from four speakers were analyzed. Behaviors in spontaneous speech include not only the activity similar to that observed for read utterances but also additional types of behaviors. Some without‐breath pauses were observed to have regions of oscillation in net effort. Such behavior appears to be more common during longer pauses and during hesitations. Secondly, some without‐breath pauses were generated with a reversed pattern—sharp movement toward...
Journal of the Acoustical Society of America | 2006
Helen M. Hanson; Janet Slifka; Stefanie Shattuck-Hufnagel; James B. Kobler
The subglottal pressure contour Ps for speech is considered to have three phases: initiation (rapid rise), working (level or slightly declining), and termination (rapid fall). The current work focuses on characterization of the working phase in terms of the distribution of pitch accents and of phrase and boundary tones. In particular, the degree of Ps declination is studied. A measure of Ps declination has proven difficult to define [cf. Strik and Boves, J. Phonetics 23, 203–220 (1995)]. Therefore, in pilot work, subjective ratings of degree of declination are made on a subset of a corpus in which the tone distribution is controlled. Significant variation in the degree of declination is observed among speakers. For example, Ps for one speaker is relatively constant to slightly declining, while for another it is almost always sharply declining. For some speakers utterances with an early‐occurring nuclear pitch accent (NPA) show greater degree of declination than utterances with a late‐occurring NPA, as do ...
Journal of the Acoustical Society of America | 2006
Janet Slifka; Kushan Surana
Quantification of the acoustic characteristics of irregular phonation provides a foundation for automatic detection of regions of irregular phonation in continuous speech. Recent results for automatic classification of regions of phonation as either regular or irregular demonstrate classification rates greater than 90% (false positive <10%) [K. Surana, M.Eng. thesis, MIT, Cambridge, MA 2006]. Similar acoustic cues may be useful in separating subtypes of irregular phonation. Two types of irregular phonation are examined: (1) regions characterized by reduced airflow, assumed to correspond to tightly adducted vocal folds with brief regions of separation, and (2) regions characterized by increased airflow, assumed to correspond to a spread or spreading vocal‐fold configuration [J. Slifka, J. Voice (in press)]. Reduced‐airflow tokens are extracted, using airflow, audio, and electroglottography signals, from utterance‐medial locations, and increased‐airflow tokens are all utterance final (20 tokens/speaker, 4 sp...
Journal of the Acoustical Society of America | 2006
Chiyoun Park; Janet Slifka
Algorithms for consonant landmark detection, such as Liu [S. Liu, J. Acoust. Soc. Am. 100, 3417–3430 (1996)], extract cues to specific types of abruptnesses in the acoustics. The abruptnesses indicate occurrences of closure and release for obstruent and sonorant consonants, and burst release for stop consonants. In Liu’s algorithm, fixed thresholds are used to filter out abruptnesses that are unlikely to be true landmarks. The resulting set of landmarks does not retain any information regarding these filtered‐out instances. However, such information may be useful later in the lexical access process, especially given the range of contextual variation in the speech signal. In this work, the landmark detection process is reformulated as a probabilistic system. First, thresholds are lowered to include more candidates, and then a probability value is calculated for each candidate. An N‐best search is used to pick the most likely sequences of obstruent landmarks based on the calculated probabilities. Experiments with 80 sentences from the TIMIT database detect corresponding landmarks within 40‐ms windows of 96% of hand‐labeled obstruent landmarks, 98% of burst‐release landmarks, and 76% of sonorant landmarks. Applying a 5‐best search results in a 9% deletion rate and a 4% insertion rate. [Work supported by NIH DC02978.]
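The idea of scoring candidates and then running an N‐best search can be sketched as below. This is a minimal illustration, not the authors' implementation: it assumes each candidate carries an independent probability of being a true landmark and ranks keep/drop hypotheses by their joint probability. The function name `n_best_landmark_sequences` and the example times and probabilities are hypothetical.

```python
import heapq
import math


def n_best_landmark_sequences(candidates, n):
    """Return the n most probable keep/drop hypotheses over landmark candidates.

    candidates: list of (time_s, prob) pairs in time order, where prob is the
    (assumed independent) probability that the candidate is a true landmark.
    Each hypothesis is (joint_probability, [times of the kept candidates]).
    """
    beam = [(0.0, [])]  # (log-probability so far, kept landmark times)
    for time_s, p in candidates:
        extended = []
        for logp, kept in beam:
            if p > 0.0:  # hypothesis that keeps this candidate
                extended.append((logp + math.log(p), kept + [time_s]))
            if p < 1.0:  # hypothesis that drops this candidate
                extended.append((logp + math.log1p(-p), kept))
        # Keep only the n best partial hypotheses before moving on.
        beam = heapq.nlargest(n, extended, key=lambda h: h[0])
    return [(math.exp(logp), kept) for logp, kept in beam]


if __name__ == "__main__":
    # Hypothetical candidates: (time in seconds, landmark probability).
    example = [(0.10, 0.9), (0.25, 0.4), (0.40, 0.8)]
    for prob, kept in n_best_landmark_sequences(example, n=2):
        print(round(prob, 3), kept)
```

Because the candidates are treated as independent here, pruning to the n best partial hypotheses at each step does not discard any of the final n best sequences. A model with dependencies between neighboring landmarks would need a different scoring function.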
Journal of the Acoustical Society of America | 2005
Helen M. Hanson; Janet Slifka; Stefanie Shattuck-Hufnagel; James B. Kobler
Subglottal pressure (Ps) contours for speech are described as having three phases: initial rise, constant or declining working phase, and final fall. The current work is part of a project to relate characteristics of the Ps contour to prosodic events. To that end, one must identify the three phases in a Ps contour. In past work, it was found that the initial phase is relatively easy to identify, but the transition from the working phase to final fall is less clear [J. Slifka (2000)]. Confounding issues could include segmental impedance, pitch accents, and phrase and boundary tones, all of which can have local effects on Ps. This work attempts to control tones and segments at the ends of utterances in order to better identify final fall. Lung pressure is estimated from esophageal pressure (corrected for lung volume). Pilot data from one subject indicate that the beginning of final fall is easier to identify when the phrase and boundary tones are low than when they are high. Results will be prese...
Journal of Voice | 2006
Janet Slifka
Archive | 2006
Kushan Surana; Janet Slifka
Conference of the International Speech Communication Association | 2006
Kushan Surana; Janet Slifka