Mark Tiede
Haskins Laboratories
Publications
Featured research published by Mark Tiede.
Journal of Phonetics | 2014
Argyro Katsika; Jelena Krivokapic; Christine Mooshammer; Mark Tiede; Louis Goldstein
This study investigates the coordination of boundary tones as a function of stress and pitch accent. Boundary tone coordination has not been experimentally investigated previously, and it is unclear whether the effect of prominence on this coordination is lexical (stress-driven) or phrasal (pitch accent-driven) in nature. We assess these issues using a variety of syntactic constructions to elicit different boundary tones in an Electromagnetic Articulography (EMA) study of Greek. The results indicate that the onset of boundary tones co-occurs with the articulatory target of the final vowel. This timing is further modified by stress, but not by pitch accent: boundary tones are initiated earlier in words with non-final stress than in words with final stress, regardless of accentual status. Visual inspection of the data reveals that phrase-final words are followed by acoustic pauses during which specific articulatory postures occur. Additional analyses show that these postures reach their achievement point at a stable temporal distance from boundary tone onsets regardless of stress position. Based on these results and parallel findings on boundary lengthening reported elsewhere, a novel approach to prosody is proposed within the context of Articulatory Phonology: rather than treating prosodic (lexical and phrasal) events as independent entities, a set of coordination relations between them is suggested. The implications of this account for prosodic architecture are discussed.
Journal of Phonetics | 2012
Christine Mooshammer; Louis Goldstein; Hosung Nam; Scott McClure; Elliot Saltzman; Mark Tiede
This study compares the time needed to initiate words with varying syllable structures (V, VC, CV, CVC, CCV, CCVC). To test the hypothesis that different syllable structures require different amounts of time to prepare their temporal controls, or plans, two delayed naming experiments were carried out. In the first, initiation time was determined from acoustic recordings. The results confirmed the hypothesis but also showed an interaction with the initial segment (i.e., vowel-initial words were initiated later than words beginning with consonants, but this difference was much smaller for words starting with stops than for words starting with /l/ or /s/). Adding a coda did not affect initiation time. To rule out effects of segment-specific differences in the articulatory-to-acoustic interval, a second experiment was performed in which movements of the tongue, jaw, and lips were recorded by means of electromagnetic articulography. Initiation times based on these articulatory measurements showed a significant syllable structure effect, with VC words initiated significantly later than CV(C) words, and only minor effects of the initial segment. These results can be partly explained by the amount of accumulated experience a speaker has in coordinating the relevant gesture combinations and triggering them appropriately in time.
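As a rough illustration of the acoustic measurement involved, initiation time in a delayed naming paradigm can be estimated as the first post-cue frame whose short-time energy crosses a threshold. The detection parameters below are illustrative assumptions, not the procedure reported in the paper.

```python
import numpy as np

def initiation_time(signal, fs, go_sample, frame_ms=10.0, threshold_db=-35.0):
    """Estimate acoustic initiation time (s) as the first frame after the
    go-signal whose short-time energy exceeds a threshold relative to the
    utterance peak. Parameter values are illustrative, not from the paper."""
    frame_len = int(fs * frame_ms / 1000)
    post = signal[go_sample:]
    n_frames = len(post) // frame_len
    frames = post[:n_frames * frame_len].reshape(n_frames, frame_len)
    energy_db = 10 * np.log10(np.mean(frames**2, axis=1) + 1e-12)
    rel = energy_db - energy_db.max()          # dB relative to loudest frame
    above = np.nonzero(rel > threshold_db)[0]  # frames exceeding threshold
    if len(above) == 0:
        return None                            # no speech detected
    return above[0] * frame_len / fs           # seconds after the go-signal
```

Per-condition means of such initiation times could then be compared across the six syllable-structure types.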
Phonetica | 2012
Donna Erickson; Atsuo Suemitsu; Yoshiho Shibuya; Mark Tiede
This paper examines kinematic patterns of jaw opening and associated F1 values of 4 American English speakers in productions of the sentence ‘I saw five bright highlights in the sky’. Results show strong-weak jaw opening alternations during the production of the utterance, and significant correlation of F1 with jaw opening for 3 of the 4 speakers. The observed jaw opening patterns correspond to metrically generated syllable stress levels for productions of the sentence by these 4 speakers.
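The jaw-F1 relation reported here amounts to a per-speaker correlation between jaw-opening maxima and F1 at those points. A minimal sketch with placeholder values (the arrays below are invented for illustration):

```python
import numpy as np
from scipy.stats import pearsonr

# Hypothetical per-syllable measurements for one speaker:
# maximum jaw opening (mm) and F1 (Hz) measured at that point.
jaw_opening = np.array([11.2, 7.9, 12.5, 8.4, 13.1, 9.0, 12.8, 8.8])
f1_hz       = np.array([720., 560., 760., 580., 790., 600., 770., 590.])

r, p = pearsonr(jaw_opening, f1_hz)
print(f"r = {r:.2f}, p = {p:.3f}")  # a significant positive r would mirror
                                    # the jaw-F1 relation found for 3 of 4 speakers
```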
Journal of Phonetics | 2016
Martijn Wieling; Fabian Tomaschek; Denis Arnold; Mark Tiede; Franziska Bröker; Samuel Thiele; Simon N. Wood; R. Harald Baayen
The present study introduces articulography, the measurement of the positions of the tongue and lips during speech, as a promising method for the study of dialect variation. By using generalized additive modeling to analyze articulatory trajectories, we are able to reliably detect aggregate group differences while simultaneously taking into account individual variation across dozens of speakers. Our results for Dutch dialect data show clear differences between the southern and the northern dialect with respect to tongue position, with a more frontal tongue position in the dialect of Ubbergen (in the southern half of the Netherlands) than in the dialect of Ter Apel (in the northern half of the Netherlands). Articulography thus appears to be a suitable tool for investigating structural differences in pronunciation at the dialect level.
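Generalized additive models of this kind are typically fit in R with mgcv (coauthor Simon Wood's package). As a loose Python stand-in, penalized B-spline fits per dialect group can illustrate the aggregate group comparison; the data, smoother settings, and two-group simplification below are assumptions for illustration, and the paper's models additionally account for per-speaker variation.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import SplineTransformer
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 200)[:, None]           # normalized time within the word

# Hypothetical front-back tongue position (mm) for two dialect groups;
# the southern group is given a more frontal (higher) trajectory.
south = 5 + 3 * np.sin(np.pi * t.ravel()) + rng.normal(0, 0.5, 200)
north = 3 + 3 * np.sin(np.pi * t.ravel()) + rng.normal(0, 0.5, 200)

def smooth_fit(y):
    """Penalized B-spline regression: a crude stand-in for a GAM smooth."""
    model = make_pipeline(SplineTransformer(n_knots=10, degree=3), Ridge(alpha=1.0))
    return model.fit(t, y).predict(t)

difference = smooth_fit(south) - smooth_fit(north)
print(f"mean group difference: {difference.mean():.2f} mm")
```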
Journal of the Acoustical Society of America | 2012
Hosung Nam; Vikramjit Mitra; Mark Tiede; Mark Hasegawa-Johnson; Carol Y. Espy-Wilson; Elliot Saltzman; Louis Goldstein
Speech can be represented as a constellation of constricting vocal tract actions called gestures, whose temporal patterning with respect to one another is expressed in a gestural score. Current speech datasets do not come with gestural annotation, and no formal gestural annotation procedure exists at present. This paper describes an iterative analysis-by-synthesis, landmark-based time-warping architecture for gestural annotation of natural speech. For a given utterance, the Haskins Laboratories Task Dynamics Application (TADA) model is employed to generate a corresponding prototype gestural score. The gestural score is temporally optimized through an iterative time-warping process such that the acoustic distance between the original and TADA-synthesized speech is minimized. This paper demonstrates that the proposed iterative approach is superior to conventional acoustically referenced dynamic time-warping procedures and provides reliable gestural annotation for speech datasets.
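At the core of the annotation loop is a time warp aligning TADA-synthesized speech to the natural utterance. A bare-bones dynamic time warping routine over acoustic feature frames is sketched below; the frame features and distance measure are placeholders, not the paper's landmark-based variant.

```python
import numpy as np

def dtw(X, Y):
    """Align two feature sequences (frames x dims) by dynamic time warping.
    Returns the warping path as (i, j) frame pairs and the total cost."""
    n, m = len(X), len(Y)
    # Pairwise Euclidean frame distances.
    D = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1)
    C = np.full((n + 1, m + 1), np.inf)
    C[0, 0] = 0.0
    for i in range(1, n + 1):          # accumulate minimal alignment cost
        for j in range(1, m + 1):
            C[i, j] = D[i-1, j-1] + min(C[i-1, j], C[i, j-1], C[i-1, j-1])
    # Backtrace from the end to recover the warping path.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([C[i-1, j-1], C[i-1, j], C[i, j-1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return path[::-1], C[n, m]

# Example: warp a synthesized sequence onto a "natural" one.
nat = np.random.default_rng(1).normal(size=(50, 13))   # e.g., MFCC frames
syn = nat[::2]                                         # a time-compressed copy
path, cost = dtw(syn, nat)
```

In the paper's architecture, the recovered warping path would be used to retime the gestural score before the next synthesis pass, iterating until the acoustic distance stops decreasing.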
Journal of Speech Language and Hearing Research | 2014
Lucie Ménard; Annie Leclerc; Mark Tiede
PURPOSE: The role of vision in speech representation was investigated in congenitally blind speakers and sighted speakers by studying the correlates of contrastive focus, a prosodic condition in which phonemic contrasts are enhanced. It has been reported that the lips (visible articulators) are less involved in implementing the rounding feature for blind speakers. If the weight of visible gestures in speech representation is reduced in blind speakers, they should show different strategies to mark focus-induced prominence. METHOD: Nine congenitally blind French speakers and 9 sighted French speakers were recorded while uttering sentences in neutral and contrastive focus conditions. Internal lip area, upper lip protrusion, and acoustic values (formants, fundamental frequency, duration, and intensity) were measured. RESULTS: In the acoustic domain, both groups signaled focus by using comparable values of fundamental frequency, intensity, and duration. Formant values in sighted speakers were more affected by the prosodic condition. In the articulatory domain, sighted speakers significantly altered lip geometry in the contrastive focus condition compared with the neutral condition, whereas blind speakers did not. CONCLUSION: These results suggest that implementation of prosodic focus is affected by congenital visual deprivation. The authors discuss how these findings can be interpreted in the framework of the perception-for-action-control theory.
Speech Communication | 2017
Vikramjit Mitra; Ganesh Sivaraman; Hosung Nam; Carol Y. Espy-Wilson; Elliot Saltzman; Mark Tiede
Studies have shown that articulatory information helps model speech variability and, consequently, improves speech recognition performance. But learning speaker-invariant articulatory models is challenging, as speaker-specific signatures in both the articulatory and acoustic spaces increase the complexity of the speech-to-articulatory mapping, which is already an ill-posed problem due to its inherent nonlinearity and non-uniqueness. This work explores using deep neural networks (DNNs) and convolutional neural networks (CNNs) to map speech data into its corresponding articulatory space. Our speech-inversion results indicate that the CNN models perform better than their DNN counterparts. In addition, we use these inverse models to generate articulatory information from speech for two separate speech recognition tasks: the WSJ1 and Aurora-4 continuous speech recognition tasks. This work proposes a hybrid convolutional neural network (HCNN), in which two parallel layers are used to jointly model the acoustic and articulatory spaces, and the decisions from the parallel layers are fused at the output context-dependent (CD) state level. The acoustic model performs time-frequency convolution on filterbank-energy features, whereas the articulatory model performs time convolution on the articulatory features. The performance of the proposed architecture is compared to that of CNN- and DNN-based systems using gammatone filterbank energies as acoustic features; the results indicate that the HCNN-based model achieves lower word error rates than the CNN/DNN baseline systems.
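The parallel-branch idea can be made concrete with a small PyTorch schematic. Layer shapes, the number of articulatory (tract-variable) channels, and the fusion layer below are illustrative guesses, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class HybridCNN(nn.Module):
    """Two parallel branches: time-frequency convolution over filterbank
    energies and time convolution over articulatory trajectories, fused
    at the context-dependent (CD) state output layer (illustrative sizes)."""
    def __init__(self, n_cd_states=2000, n_tvs=8):
        super().__init__()
        self.acoustic = nn.Sequential(              # input: (B, 1, mel, time)
            nn.Conv2d(1, 32, kernel_size=(8, 5)), nn.ReLU(),
            nn.MaxPool2d((2, 1)),
            nn.Conv2d(32, 64, kernel_size=(4, 3)), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),  # -> (B, 64)
        )
        self.articulatory = nn.Sequential(          # input: (B, n_tvs, time)
            nn.Conv1d(n_tvs, 32, kernel_size=5), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=3), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),  # -> (B, 64)
        )
        self.fusion = nn.Linear(64 + 64, n_cd_states)

    def forward(self, fbank, tvs):
        joint = torch.cat([self.acoustic(fbank), self.articulatory(tvs)], dim=1)
        return self.fusion(joint)                   # CD-state logits

model = HybridCNN()
logits = model(torch.randn(4, 1, 40, 11), torch.randn(4, 8, 11))
```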
Clinical Linguistics & Phonetics | 2016
Katherine M. Dawson; Mark Tiede; D. H. Whalen
Quantification of tongue shape is potentially useful for indexing articulatory strategies arising from intervention, therapy, and development. Tongue shape complexity is a parameter that can be used to reflect regional functional independence of the tongue musculature. This paper considers three different shape quantification methods, based on Procrustes analysis, curvature inflections, and Fourier coefficients, and uses linear discriminant analysis to test how well each method is able to classify tongue shapes from different phonemes. Test data are taken from six native speakers of American English producing 15 phoneme types. Results show that tongue shapes are classified accurately when the quantification methods are combined. These methods hold promise for extending the use of ultrasound in clinical assessments of speech deficits.
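Of the three quantifications, the Fourier-based one pairs naturally with the discriminant classification. A sketch with placeholder contours and labels follows; the paper's exact parameterization and evaluation setup may well differ.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

def fourier_features(contour, n_coeffs=5):
    """Describe a tongue contour (N x 2 points, front to back) by the
    low-order Fourier coefficients of its complex representation."""
    z = contour[:, 0] + 1j * contour[:, 1]
    z = z - z.mean()                      # translation invariance
    coeffs = np.fft.fft(z)[:n_coeffs]     # keep coarse shape components
    return np.concatenate([coeffs.real, coeffs.imag])

rng = np.random.default_rng(2)
# Placeholder data: 90 contours of 100 points each, 3 phoneme classes.
contours = rng.normal(size=(90, 100, 2)).cumsum(axis=1)
labels = np.repeat(["i", "a", "r"], 30)

X = np.array([fourier_features(c) for c in contours])
scores = cross_val_score(LinearDiscriminantAnalysis(), X, labels, cv=5)
print(f"classification accuracy: {scores.mean():.2f}")
```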
Phonetica | 2015
Arthur S. Abramson; Mark Tiede; Theraphan Luangthongkum
Mon is spoken in villages in Thailand and Myanmar. The dialect of Ban Nakhonchum, Thailand, has 2 voice registers, modal and breathy; these phonation types, along with other phonetic properties, distinguish minimal pairs. Four native speakers of this dialect recorded repetitions of 14 randomized words (7 minimal pairs) for acoustic analysis. We used a subset of these pairs in a listening test to verify the perceptual robustness of the register distinction. Acoustic analysis found significant differences in the noise component, spectral slope, and fundamental frequency. In a subsequent session, the 4 speakers were also recorded using electroglottography, which showed systematic differences in the contact quotient. The salience of these properties in maintaining the register distinction is discussed in the context of possible tonogenesis in this language.
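Contact quotient from an EGG signal is commonly estimated with a level criterion within each glottal cycle. The sketch below uses an illustrative 35% criterion and simple peak-based cycle detection; these are assumptions, not the paper's procedure.

```python
import numpy as np
from scipy.signal import find_peaks

def contact_quotient(egg, fs, f0_max=400.0, level=0.35):
    """Estimate mean contact quotient: the fraction of each glottal cycle
    during which the EGG signal exceeds a level criterion (here 35% of the
    cycle's peak-to-peak range, an illustrative choice)."""
    peaks, _ = find_peaks(egg, distance=int(fs / f0_max))  # one peak per cycle
    cqs = []
    for a, b in zip(peaks[:-1], peaks[1:]):
        cycle = egg[a:b]
        thresh = cycle.min() + level * (cycle.max() - cycle.min())
        cqs.append(np.mean(cycle > thresh))   # proportion of cycle in contact
    return float(np.mean(cqs))

# Example: a synthetic 120 Hz EGG-like waveform sampled at 10 kHz.
fs = 10_000
t = np.arange(fs) / fs
egg = np.maximum(np.sin(2 * np.pi * 120 * t), 0.0) ** 2  # skewed pulse train
print(f"CQ = {contact_quotient(egg, fs):.2f}")
```

On real data, the breathy register would be expected to show a lower contact quotient than the modal register.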
Journal of the Acoustical Society of America | 2015
Atsuo Suemitsu; Jianwu Dang; Takayuki Ito; Mark Tiede
Articulatory information can support learning or remediating pronunciation of a second language (L2). This paper describes an electromagnetic articulometer-based visual feedback approach in which an articulatory target is presented in real time to facilitate L2 pronunciation learning. The approach trains learners to adjust their articulatory positions to match a target for an L2 vowel estimated from productions of vowels that overlap in both L1 and L2. For Japanese learners of the American English vowel /æ/, training that included visual feedback improved pronunciation regardless of whether audio training was also included. Articulatory visual feedback is thus shown to be an effective method for facilitating L2 pronunciation learning.
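One conceivable way to realize the target estimation and feedback described here: fit an affine map from a reference articulatory space to the learner's using the vowels shared by L1 and L2, predict the learner's /æ/ target, and report distance to it. This is a hypothetical stand-in, not the paper's estimator; all positions and the tolerance are invented.

```python
import numpy as np

def estimate_target(ref_shared, learner_shared, ref_l2):
    """Map a reference articulatory space onto a learner's using vowels shared
    by L1 and L2 (least-squares affine fit), then predict the learner's target
    for the L2 vowel. A hypothetical stand-in for the paper's estimator."""
    A = np.hstack([ref_shared, np.ones((len(ref_shared), 1))])   # affine design
    W, *_ = np.linalg.lstsq(A, learner_shared, rcond=None)
    return np.append(ref_l2, 1.0) @ W

def feedback(current, target, tolerance_mm=3.0):
    """Real-time feedback: distance from the current tongue sensor position
    to the estimated target (tolerance is an illustrative choice)."""
    d = float(np.linalg.norm(current - target))
    return d, d <= tolerance_mm

# Hypothetical 2-D (horizontal, vertical) tongue-dorsum positions in mm.
ref_shared     = np.array([[0., 20.], [10., 5.], [25., 18.]])  # e.g., /i a u/
learner_shared = np.array([[2., 18.], [11., 4.], [26., 17.]])
target = estimate_target(ref_shared, learner_shared, np.array([8., 8.]))  # /ae/
dist, on_target = feedback(np.array([10., 10.]), target)
```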