James Sneed German
Nanyang Technological University
Publication
Featured research published by James Sneed German.
Journal of Phonetics | 2013
James Sneed German; Katy Carlson; Janet B. Pierrehumbert
In an experiment spanning a week, American English speakers imitated a Glaswegian (Scottish) English speaker. The target sounds were allophones of /t/ and /r/, as the Glaswegian speaker aspirated word-medial /t/ but pronounced /r/ as a flap initially and medially. This experiment therefore explored (a) whether speakers could learn to reassign a sound they already produce (flap) to a different phoneme, and (b) whether they could learn to reliably produce aspirated /t/ in an unusual phonological context. Speakers appeared to learn systematically, as they could generalize to words which they had never heard the Glaswegian speaker pronounce. The pattern for /t/ was adopted and generalized with high overall reliability (96%). For flap, there was a mix of categorical learning, with the allophone simply switching to a different use, and parametric approximations of the “new” sound. The positional context was clearly important, as flaps were produced less successfully when word-initial. And although there was variability in success rates, all speakers learned to produce a flap for /r/ at least some of the time and retained this learning over a week's time. These effects are most easily explained in a hybrid of neo-generative and exemplar models of speech perception and production.
IEEE Transactions on Multimedia | 2014
Talal Bin Amin; Pina Marziliano; James Sneed German
Voice impersonators possess a flexible voice which allows them to imitate and create different voice identities. These impersonations present a challenge for forensic analysis and speaker identification systems. To better understand the phenomena underlying successful voice impersonation, we collected a database of synchronous speech and ElectroGlottoGraphic (EGG) signals from three voice impersonators each producing nine distinct voice identities. We analyzed glottal and vocal tract measures including F0, speech rate, vowel formant frequencies, and timing characteristics of the vocal folds. Our analysis confirmed that the impersonators modulated all four parameters in producing the voices, and provides a lower bound on the scale of variability that is available to impersonators. Importantly, vowel formant differences across voices were highly dependent on vowel category, showing that such effects cannot be captured by global transformations that ignore the linguistic parse. We address this issue through the development of a no-reference objective metric based on the vowel-dependent variance of the formants associated with each voice. This metric both ranks the impersonators' natural voices highly, and correlates strongly with the results of a subjective listening test. Together, these results demonstrate the utility of voice variability data for the development of voice disguise detection and speaker identification applications.
Journal of the Acoustical Society of America | 2013
Talal Bin Amin; James Sneed German; Pina Marziliano
This paper describes the results of an experiment on Text Independent Speaker Identification for a large number of speakers (about 100) using a standard vocabulary of about 23 NATO words such as Alfa, Bravo, etc. These words were spoken in isolation in a sound-treated room by native Hindi speakers with a very good education in English (both male and female) and recorded by a three-channel data recording system: a cardioid microphone, an electret condenser microphone, and a NOKIA mobile telephone. The pre-processed digitized database of isolated words was further processed to determine 39 MFCCs and their derivatives, which were used to build an HMM model for each speaker based on all the words. The HMM model was trained using the HTK toolkit to generate the model parameters and tested using the Viterbi algorithm. The identification of speakers was done in a closed-set manner, based on comparison of each NATO word in the model. In addition to correct identification, false acceptance and false rejection scores were also found. The results show varying performance across channels and between male and female speakers. The overall identification scores vary between 60% and 70%. The paper gives a detailed analysis of the results.
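The closed-set decision mentioned in this abstract can be illustrated with a minimal sketch. It assumes each enrolled speaker's model has already produced a log-likelihood for every test word; the function name `identify_speaker` and all score values are hypothetical and are not taken from the paper, and no HTK call is reproduced here.

```python
from typing import Dict, List


def identify_speaker(scores: Dict[str, List[float]]) -> str:
    """Closed-set decision: return the enrolled speaker whose model gives
    the highest total log-likelihood over all test words."""
    if not scores:
        raise ValueError("at least one enrolled speaker is required")
    # Sum the per-word log-likelihoods for each speaker model.
    totals = {speaker: sum(word_scores) for speaker, word_scores in scores.items()}
    # In a closed set, the answer is always one of the enrolled speakers.
    return max(totals, key=totals.get)


# Hypothetical per-word log-likelihoods (illustrative values only).
example_scores = {
    "speaker_a": [-410.2, -398.7, -405.1],
    "speaker_b": [-389.4, -392.0, -390.8],
    "speaker_c": [-420.9, -415.3, -418.6],
}
best = identify_speaker(example_scores)
```

Because the set is closed, the decision never rejects a test utterance outright; false acceptance and false rejection rates of the kind the abstract reports would need an additional threshold, which this sketch does not model.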
Language and Speech | 2016
James Sneed German; Mariapaola D'Imperio
This study addresses the relationship between information structure and intonation in French. Using an interactive speech production experiment, it tests the hypothesis that the French initial rise (LHi) is used to mark the left edge of a contrastively focused constituent. Since the occurrence of the initial rise is also known to be sensitive to the length of an Accentual Phrase (AP), AP length was manipulated within the same experiment in a 2 × 2 design. This made it possible to explore the issue of whether the initial rise represents a true marker of focus in the traditional sense, or whether the association is less direct. The results show that focus and phrase length make contributions to the distribution of the initial rise, but with no interaction. It is argued that these findings are incompatible with a model that assumes a direct mapping between focus and the initial rise, and that the relatively weak association can nevertheless be informative in a model of interpretation that integrates multiple probabilistic inputs to initial rise occurrence. These findings represent the first quantitative experimental assessment of focus realization in French in a non-corrective context, and establish a previously undocumented link between the initial rise and discourse-level meaning.
Biomedical Circuits and Systems Conference | 2015
Sai Praveen Kadiyala; Aritra Sen; Shubham Mahajan; Qingyun Wang; Avinash Lingamneni; James Sneed German; Xu Hong; Ansuman Banerjee; Krishna V. Palem; Arindam Basu
Inexact design has been recognized as a very viable approach to achieving significant gains in the energy, area, and speed efficiency of digital circuits. By deliberately trading error in return for such gains, inexact circuits and architectures have been shown to be especially useful in contexts where our senses, such as sight and hearing, can compensate for the loss in accuracy. It is therefore important to understand and characterize the manner in which our sensorial systems interact and compensate for the loss in accuracy, and further to use this knowledge to optimize and guide the manner in which inexactness is introduced. For the first time, we achieve both of these goals in this paper in the context of human audition, specifically using the architecture of a hearing aid and the DSP primitive of an FIR filter as our candidate. Our algorithms for designing an inexact hearing aid thus use intelligibility as the metric. The resulting inexact FIR filter in the hearing aid is 1.5X or 1.8X more efficient in terms of power-area product while producing 5% or 10% less intelligible speech, respectively, when compared with the corresponding exact version.
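As a rough illustration of trading accuracy for cost in an FIR filter, the sketch below compares an exact filter with one whose coefficients are rounded to a few fractional bits. Coefficient rounding is an assumed stand-in for inexactness, not the design method of the paper, and mean absolute error is an assumed stand-in for the intelligibility metric the abstract describes. All names and values are hypothetical.

```python
from typing import List


def fir_filter(signal: List[float], taps: List[float]) -> List[float]:
    """Direct-form FIR: y[n] = sum_k taps[k] * x[n-k], with x[n] = 0 for n < 0."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, tap in enumerate(taps):
            if n - k >= 0:
                acc += tap * signal[n - k]
        out.append(acc)
    return out


def quantize(taps: List[float], frac_bits: int) -> List[float]:
    """Round each coefficient to frac_bits fractional bits (assumed form of inexactness)."""
    scale = 2 ** frac_bits
    return [round(t * scale) / scale for t in taps]


def mean_abs_error(a: List[float], b: List[float]) -> float:
    """Average absolute difference between two equal-length output sequences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Hypothetical 3-tap low-pass filter and a unit impulse as the test input.
exact_taps = [0.2, 0.6, 0.2]
inexact_taps = quantize(exact_taps, frac_bits=2)
impulse = [1.0, 0.0, 0.0, 0.0]
error = mean_abs_error(fir_filter(impulse, exact_taps),
                       fir_filter(impulse, inexact_taps))
```

Fewer fractional bits generally mean cheaper multipliers but a larger output error; a perceptual measure such as intelligibility would replace `mean_abs_error` in a study like the one summarized above.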
International Conference on Multimedia and Expo | 2012
Talal Bin Amin; Pina Marziliano; James Sneed German
Voice impersonators possess a flexible voice and thus can change their voice identity. They are able to imitate various people and characters who differ in age, gender, accent, and voice quality. State-of-the-art electronic voice conversion systems are not able to successfully mimic their human counterparts, as they lack naturalness. To understand why human impersonators are successful and what parameters they rely on to change their voice, we analyze nine voices produced by a professional voice impersonator. We compute different acoustical measures and discuss their linguistic implications. The acoustical measures include pitch, speech rate, and formant frequencies. Our results show that differences in voice identity features such as age and gender are reflected in the acoustic parameters of the impersonations. The analysis is distinguished from previous studies on impersonators in giving full consideration to voice identity features.
Language, games, and evolution | 2011
James Sneed German; Eyal Sagi; Stefan Kaufmann; Brady Clark
In English and other languages, the distribution of nuclear pitch accents within a sentence usually reflects how the meaningful parts of the sentence relate to the context. Generally speaking, the nuclear pitch accent can only occur felicitously on focused parts of the sentence, corresponding to information that is not contextually retrievable or given. In most contemporary theories, focus is formally represented by an abstract syntactic feature ‘F’. Those parts of the sentence that are given tend to resist F-marking and thus nuclear accentuation. In short, there is a more or less tight coupling between (i) the contextual information status of parts of the sentence; (ii) the focus structure of the sentence (represented by the distribution of syntactic F-marking); and (iii) the actual accent placement in the phonological form.
Journal of the Acoustical Society of America | 2005
James Sneed German; Janet B. Pierrehumbert; Katy Carlson
This study explored speakers’ success in acquiring an unfamiliar dialect. Twenty‐four speakers of American English attempted to learn Glaswegian, a dialect in which /r/ is produced as a tap. Its closest counterpart is the flap allophone of /t/. The question posed is whether an allophone of one consonant can be remapped to serve as the realization of another consonant. Subjects were trained and tested with materials in which /r/ appeared only in the last word. Each block contained 12 initial and 12 medial /r/’s. A training block was presented twice, followed by a generalization block. A week after training, subjects returned for retesting on the original training and generalization blocks, and a new generalization block. The block order was counterbalanced. Subjects were highly successful at imitating medial /r/ (71%), a striking example of fast categorical remapping in production. Success rates were lower (46%) for /r/ in initial position, where flaps do not appear in English. Generalization and retest re...
2ème Congrès Mondial de Linguistique Française | 2010
James Sneed German; Mariapaola D'Imperio
This study addresses the relationship between information structure and prosodic form in French. More specifically, it tests whether phrase-initial accents (LHi) are associated with the left edge of contrastively focused constituents in wh-interrogatives. Since word length has also been correlated with LHi distribution (Astesano et al. 2007), the study further examines the relative contribution of constraints operating at two distinct levels: information structure and phonological structure. The results show that each set of constraints makes an independent contribution to the occurrence of LHi with no interaction. In other words, phrase-initial accents are more likely to occur on an accentual phrase when its left edge coincides with the left edge of a contrastively focused constituent, and more likely to occur on constituents with more syllables, but constituent length does not limit the extent to which phrase-initial accents mark contrast, or vice versa. By comparison, the distribution of AP phrase boundaries is not correlated with the left edge of contrastively focused constituents. The findings of this study represent the first quantitative description of focus realization in French in a non-corrective context. They establish a previously undocumented link between LHi and discourse-level meaning and have important implications for the possibility of an intermediate level of phrasing in the prosodic hierarchy.
Language | 2006
James Sneed German; Janet B. Pierrehumbert; Stefan Kaufmann