Debra Yarrington
Alfred I. duPont Hospital for Children
Publications
Featured research published by Debra Yarrington.
Conference on Computers and Accessibility | 2005
Debra Yarrington; Christopher A. Pennington; John Gray; H. Timothy Bunnell
We will be demonstrating the ModelTalker Voice Creation System, which allows users to create a personalized synthetic voice with an unrestricted vocabulary. The system includes a tool for recording a speech inventory and a program that converts the recorded inventory into a synthetic voice for the ModelTalker TTS engine. The entire system can be downloaded for use on a home PC or in a clinical setting, and the resulting synthetic voices can be used with any SAPI-compliant system. We will demonstrate the recording process and convert the recordings into a mini-database with a limited vocabulary for participants to hear.
Meeting of the Association for Computational Linguistics | 2008
Debra Yarrington; John Gray; Christopher A. Pennington; H. Timothy Bunnell; Allegra Cornaglia; Jason Lilley; Kyoko Nagao; James B. Polikoff
We will demonstrate the ModelTalker Voice Recorder (MT Voice Recorder) -- an interface system that lets individuals record and bank a speech database for the creation of a synthetic voice. The system guides users through an automatic calibration process that sets pitch, amplitude, and silence. The system then prompts users with both visual (text-based) and auditory prompts. Each recording is screened for pitch, amplitude, and pronunciation, and users are given immediate feedback on the acceptability of each recording. Users can then re-record an unacceptable utterance. Recordings are automatically labeled and saved, and a speech database is created from these recordings. The system is intended to make the process of recording a corpus of utterances relatively easy for those inexperienced in linguistic analysis. Ultimately, the recorded corpus and the resulting speech database are used for concatenative speech synthesis, thus allowing individuals at home or in clinics to create a synthetic voice in their own voice. The interface may prove useful for other purposes as well. The system facilitates the recording and labeling of large corpora of speech, making it useful for speech and linguistic research, and it provides immediate feedback on pronunciation, thus making it useful as a clinical learning tool.
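The per-recording screening described above (check amplitude and pitch against calibrated targets, then tell the user whether to re-record) can be sketched as follows. This is a minimal illustration, not ModelTalker's actual implementation: the thresholds, function names, and the simple autocorrelation pitch estimator are all assumptions.

```python
import numpy as np

def estimate_pitch(signal, sample_rate, fmin=75.0, fmax=400.0):
    """Rough F0 estimate via autocorrelation over a short window (illustrative)."""
    window = signal[:2048] - np.mean(signal[:2048])
    corr = np.correlate(window, window, mode="full")[len(window) - 1:]
    lag_min = int(sample_rate / fmax)   # shortest plausible period
    lag_max = int(sample_rate / fmin)   # longest plausible period
    lag = lag_min + int(np.argmax(corr[lag_min:lag_max]))
    return sample_rate / lag

def screen_recording(signal, sample_rate, target_pitch,
                     pitch_tol=50.0, min_rms=0.01, max_rms=0.5):
    """Return (ok, reasons); a failed check means the user should re-record."""
    reasons = []
    rms = float(np.sqrt(np.mean(signal ** 2)))
    if not (min_rms <= rms <= max_rms):
        reasons.append("amplitude out of range")
    pitch = estimate_pitch(signal, sample_rate)
    if abs(pitch - target_pitch) > pitch_tol:
        reasons.append("pitch far from calibrated target")
    return (len(reasons) == 0, reasons)

# A clean 200 Hz tone at moderate amplitude passes both checks.
sr = 22050
t = np.arange(sr) / sr
tone = 0.1 * np.sin(2 * np.pi * 200.0 * t)
ok, reasons = screen_recording(tone, sr, target_pitch=200.0)
```

A real screener would also need pronunciation scoring, which the abstract mentions but which is well beyond a few lines of signal arithmetic.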
Journal of the Acoustical Society of America | 2000
H. Timothy Bunnell; Debra Yarrington; James B. Polikoff
Digital recordings of children producing the names ‘‘Rhonda’’ and ‘‘Wanda,’’ and/or ‘‘Toto’’ and ‘‘Coco’’ were made using the microphone input to a Toshiba laptop computer (16‐bit samples, 22 050‐Hz sampling rate) with an AKG C410/B head‐mounted condenser microphone. These names were associated with animated characters in a mock video game running on the laptop under the control of a Speech Language Pathologist. The children, ranging in age from four to six years, were undergoing speech therapy at the Alfred I. duPont Hospital for Children for one or both of two common articulation errors: /w/ substituted for /r/; and/or /t/ substituted for /k/. The initial segment in each recorded utterance was classified by laboratory staff as either r/w or t/k, and assigned a goodness rating. Discrete Hidden Markov phoneme Models (DHMMs) trained using data recorded from normally articulating children were then used to classify the same utterances and results of the automatic classification were compared to the huma...
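The classification scheme in this abstract (score an utterance under competing discrete HMMs and pick the more likely phoneme) can be sketched with toy models and the forward algorithm. The two-state models, three-symbol codebook, and all probabilities below are invented for illustration; they are not the trained DHMMs from the study.

```python
import numpy as np

def forward_likelihood(obs, start, trans, emit):
    """Probability of a discrete observation sequence under an HMM (forward pass)."""
    alpha = start * emit[:, obs[0]]          # initialize with first symbol
    for symbol in obs[1:]:
        alpha = (alpha @ trans) * emit[:, symbol]  # propagate and re-weight
    return float(alpha.sum())

# Toy 2-state models over a 3-symbol VQ "codebook": one per candidate phoneme.
start = np.array([0.6, 0.4])
trans = np.array([[0.7, 0.3],
                  [0.2, 0.8]])
emit_r = np.array([[0.7, 0.2, 0.1],   # the /r/ model favors symbols 0 then 2
                   [0.1, 0.3, 0.6]])
emit_w = np.array([[0.1, 0.6, 0.3],   # the /w/ model favors symbol 1
                   [0.5, 0.3, 0.2]])

def classify(obs):
    """Return the phoneme whose model better explains the symbol sequence."""
    p_r = forward_likelihood(obs, start, trans, emit_r)
    p_w = forward_likelihood(obs, start, trans, emit_w)
    return "/r/" if p_r > p_w else "/w/"
```

In the study, the interesting comparison was between such automatic classifications and human goodness judgments of the children's productions; that evaluation data is not reproduced here.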
Journal of the Acoustical Society of America | 1998
H. Timothy Bunnell; Stephen R. Hoskins; Debra Yarrington
Natural productions of the sentence ‘‘Bob bought Bogg’s box’’ in which focus was varied over each of the four words of the sentence were altered to produce prosodic cue neutralized versions. The alterations were applied singly and in all possible combinations to form eight experimental versions of each original sentence (the original and seven cue‐neutralized versions). The original sentences as spoken by eight talkers and their cue‐neutralized versions were presented to listeners with the task of identifying the focused item in the sentence. Results indicated that (a) overall, F0 cues were more important than either amplitude or duration cues in signaling focus, (b) the importance of amplitude and duration cues was greatly enhanced when F0 cues were neutralized, and (c) in many cases, identification of focus remained above chance after all three acoustic features were neutralized. The present study reports analyses of the residual acoustic features which continue to convey focus when amplitude, duration,...
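The neutralization logic of this study can be illustrated on per-word feature summaries: represent each word by its F0, amplitude, and duration, flatten a chosen cue to the across-word mean, and see which cues still point to the focused word. The feature values and the deviation-scoring heuristic below are made-up illustrations, not the paper's stimuli or analysis.

```python
from statistics import mean

# Per-word prosodic summaries for "Bob bought Bogg's box", focus on "bought".
# All numbers are invented for this sketch.
words = ["Bob", "bought", "Bogg's", "box"]
features = {
    "f0": [110.0, 145.0, 108.0, 105.0],      # mean F0, Hz
    "amplitude": [0.20, 0.31, 0.19, 0.18],   # RMS level
    "duration": [0.22, 0.34, 0.25, 0.21],    # seconds
}

def neutralize(feats, cues):
    """Replace each named cue track with its across-word mean (cue removed)."""
    out = {k: list(v) for k, v in feats.items()}
    for cue in cues:
        avg = mean(out[cue])
        out[cue] = [avg] * len(out[cue])
    return out

def predict_focus(feats):
    """Guess the focused word: largest summed relative deviation above the mean."""
    scores = [0.0] * len(words)
    for track in feats.values():
        avg = mean(track)
        for i, value in enumerate(track):
            scores[i] += (value - avg) / avg
    return words[scores.index(max(scores))]
```

With F0 flattened, amplitude and duration still single out "bought", loosely mirroring the finding that those cues gain importance once F0 is neutralized; with all three flattened, this toy predictor has nothing left to go on, whereas the listeners in the study still performed above chance.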
Journal of the Acoustical Society of America | 1997
H. Timothy Bunnell; Steven R. Hoskins; Debra Yarrington
F0, amplitude, and durational cues are considered the primary acoustic correlates of focus or sentence‐level stress. However, questions remain regarding: (a) the degree to which each of these cues is necessary or sufficient for signaling focus; (b) the relative importance of each of these cues; and (c) how they interact in the perception of focus. Natural productions of the sentence ‘‘Bob bought Bogg’s box,’’ in which focus was varied over each of the four words of the sentence, were altered to produce prosodic cue neutralized versions. The alterations were applied singly and in all possible combinations to form eight experimental versions of each original sentence (the original and seven cue‐neutralized versions). The sentences were presented to listeners with the task of identifying the focused item in the sentence. Results indicated that: (a) neutralizing any acoustic cue produced some degradation in performance, but even with all cues nullified, performance remained above chance for at least some word...
Journal of the Acoustical Society of America | 1997
Debra Yarrington; Steven R. Hoskins; H. Timothy Bunnell
A method has been developed for automatically extracting diphone speech segments with context‐dependent boundaries. When compared with speech synthesized from manually extracted diphone speech segments, it was found that speech synthesized from the automatically extracted segments was, overall, slightly less intelligible but slightly more natural sounding [Yarrington et al., ‘‘Robust automatic extraction of diphones with variable boundaries,’’ in EUROSPEECH ’95, 4th European Conference on Speech Communication and Technology, Vol. 3, pp. 1845–1848, Madrid, Spain (1995)]. The lower intelligibility appeared to be due to a small number of very poor diphone segments. While it is feasible to correct this problem by manually replacing misleading diphones, several changes have been made to the extraction procedure to eliminate or at least reduce the frequency of occurrence of incorrect diphones. In particular, a different spectral measure is being used for estimates of spectral similarity, and F0 plus a spectral ...
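The core of a context-dependent boundary search like the one described above is a concatenation cost: among candidate cut frames, pick the pair whose spectra match best so the join is smooth. The sketch below shows only that generic idea with plain Euclidean distance over spectral frames; the paper's actual spectral measures and F0 weighting are not reproduced here.

```python
import numpy as np

def best_join(frames_a, frames_b):
    """Pick the (i, j) frame pair minimizing spectral distance between the
    tail region of segment A and the head region of segment B.

    frames_a, frames_b: 2-D arrays of shape (num_frames, num_coeffs),
    e.g. candidate cut frames from two recorded diphone contexts.
    """
    # Pairwise Euclidean distances via broadcasting: shape (len_a, len_b).
    diff = frames_a[:, None, :] - frames_b[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    i, j = np.unravel_index(np.argmin(dist), dist.shape)
    return int(i), int(j), float(dist[i, j])

# Tiny example: the middle frame of A best matches the first frame of B.
frames_a = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
frames_b = np.array([[1.1, 1.0], [5.0, 5.0]])
cut_a, cut_b, cost = best_join(frames_a, frames_b)
```

Swapping in a perceptually weighted spectral measure, as the abstract suggests the authors did, only changes how `dist` is computed; the boundary search itself stays the same.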
Conference of the International Speech Communication Association | 2000
H. Timothy Bunnell; Debra Yarrington; James B. Polikoff
Conference of the International Speech Communication Association | 2005
H. Timothy Bunnell; Christopher A. Pennington; Debra Yarrington; John Gray
SSW | 1998
H. Timothy Bunnell; Steven R. Hoskins; Debra Yarrington
Conference of the International Speech Communication Association | 1995
Debra Yarrington; H. Timothy Bunnell; Gene Ball