Sheri Hunnicutt
Royal Institute of Technology
Publications
Featured research published by Sheri Hunnicutt.
International Conference on Acoustics, Speech, and Signal Processing | 1982
Rolf Carlson; Björn Granström; Sheri Hunnicutt
Recent advances in microprocessor, memory and signal processor technology have made it feasible to put complete speech processing equipment into a portable form. At our laboratory a higher-level programming language has been developed that is especially suitable for rule description of linguistic processes. Text-to-speech systems for several languages have been written in this framework. A cross compiler has now also been designed for the 16-bit MC 68000 microprocessor, which makes porting programs from our research computer trivial. The language-independent parts of the program for the microprocessor are written in efficient assembler code. Special hardware development has resulted in a portable, battery-operated unit that is capable of converting text to speech at a speaking rate of 250 wpm (words per minute). This opens up the possibility of speech options on computer terminals, portable large-vocabulary talking language translators, etc. The module has been tried in several applications for people with communication handicaps.
Augmentative and Alternative Communication | 2001
Parimala Raghavendra; Elisabet Rosengren; Sheri Hunnicutt
This study describes the feasibility of using speech recognition as a text input method for speakers with different degrees of dysarthria. The project investigated two different types of speech recognition systems: Prototype Swedish DragonDictate (PSDD), a speaker-adaptive phoneme-based system, and Infovox RA, a speaker-dependent, whole-word pattern-matching system. Individuals with mild and moderate dysarthria trained and then used 45 command words to input text independently into the PSDD. The results indicated that the PSDD system adapted well to the speech of individuals with mild and moderate dysarthria, but the recognition scores were lower than for a natural speaker. The PSDD system also adapted to the speech of two participants with different degrees of severe dysarthria, but they were unable to use this system independently. On the Infovox RA system, there was a wide range in the mean recognition scores for participants with dysarthria, whereas the natural speaker reached almost 100%. The recognition score for the participant with very severe dysarthria increased substantially with an adapted vocabulary on the speaker-dependent Infovox RA system. The results are discussed in terms of factors that should be considered before selecting a suitable speech recognition system for speakers with different degrees of dysarthria.
Augmentative and Alternative Communication | 2001
Sheri Hunnicutt; Johan Carlberger
The goal of this project was to design and implement a new word predictor for Swedish that would suggest words that are more grammatically appropriate, thus presenting a lower cognitive load for users and saving significantly more keystrokes than the previous predictor. The new predictor that was designed and developed uses a probabilistic language model based on the well-established ideas of the trigram predictor for speech recognition, developed by IBM. In tests, this program has been shown to result in keystroke savings of 46% given five predictions—a substantial saving compared with the 35% savings achieved with the previous predictor.
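The abstract above names two ideas that are easy to illustrate: ranking candidate words with trigram statistics, and measuring keystroke savings for a fixed number of predictions. The following is only a minimal sketch under stated assumptions, not the predictor described in the paper. The class `TrigramPredictor`, the function `keystroke_savings`, the simple count-based backoff, and the cost model (one keystroke per letter, one per selection, one per space) are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict


class TrigramPredictor:
    """Toy word predictor (hypothetical): trigram counts with backoff."""

    def __init__(self, sentences):
        self.uni = Counter()
        self.bi = defaultdict(Counter)
        self.tri = defaultdict(Counter)
        for sentence in sentences:
            # Pad with two start markers so the first word has a context.
            words = ["<s>", "<s>"] + sentence.lower().split()
            for i in range(2, len(words)):
                w = words[i]
                self.uni[w] += 1
                self.bi[words[i - 1]][w] += 1
                self.tri[(words[i - 2], words[i - 1])][w] += 1

    def predict(self, history, prefix="", k=5):
        """Return up to k words starting with prefix, most likely first."""
        padded = ["<s>", "<s>"] + [w.lower() for w in history]
        w1, w2 = padded[-2], padded[-1]
        tables = (
            self.tri.get((w1, w2), Counter()),  # most specific context
            self.bi.get(w2, Counter()),         # back off to bigrams
            self.uni,                           # back off to unigrams
        )
        ranked = []
        for table in tables:
            for word, _ in table.most_common():
                if word.startswith(prefix) and word not in ranked:
                    ranked.append(word)
                    if len(ranked) == k:
                        return ranked
        return ranked


def keystroke_savings(predictor, sentence, k=5):
    """Fraction of keystrokes saved if a word is selected as soon as listed.

    Assumed cost model: typing a word costs len(word) + 1 (letters plus a
    space); selecting a prediction after n typed letters costs n + 1.
    """
    words = sentence.lower().split()
    typed = total = 0
    for i, word in enumerate(words):
        total += len(word) + 1
        for n in range(len(word) + 1):
            if word in predictor.predict(words[:i], word[:n], k):
                typed += n + 1
                break
        else:
            typed += len(word) + 1  # never predicted: typed in full
    return 1 - typed / total
```

A real predictor would use smoothed probabilities and, as the abstract indicates, grammatical information; the sketch only shows how a list of five predictions translates into a keystroke-savings figure.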
Speech Communication | 1996
Helen M. Meng; Sheri Hunnicutt; Stephanie Seneff; Victor W. Zue
This paper describes a bi-directional letter/sound generation system based on a strategy combining data-driven techniques with a rule-based formalism. Our approach provides a hierarchical analysis of a word, including stress pattern, morphology and syllabification. Generation is achieved by a probabilistic parsing technique, where probabilities are trained from a parsed lexicon. Our training and testing corpora consisted of spellings and pronunciations for the high frequency portion of the Brown Corpus (10,000 words). The phonetic labels are augmented with markers indicating morphology and stress. We will report on two distinct grammars representing a historical perspective. Our early work with the first grammar inspired us to modify the grammar formalism, leading to greater constraint with fewer rules. We evaluated our performance on letter-to-sound generation in terms of whole word accuracy as well as phoneme accuracy. For the unseen test set, we achieved a word accuracy of 69.3% and a phoneme accuracy of 91.7% using a set of 52 distinct phonemes. While this paper focuses on letter-to-sound generation, our system is also capable of generation in the reverse direction, as reported elsewhere (Meng et al., 1994a). We believe that our formalism will be especially applicable for entering unknown words orally into a recognition system.
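The two evaluation measures mentioned above, whole-word accuracy and phoneme accuracy, differ in how harshly they penalize a single wrong phoneme. The sketch below illustrates one common way to compute them; the abstract does not state the exact scoring procedure, so the use of edit distance for phoneme errors and the function names `edit_distance` and `score` are assumptions.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution, or match at no cost
            ))
        prev = curr
    return prev[-1]


def score(pairs):
    """Return (word accuracy, phoneme accuracy).

    pairs: list of (reference, hypothesis) phoneme lists, one pair per word.
    A word counts as correct only if every phoneme matches; phoneme accuracy
    is assumed here to be 1 - (edit errors / reference phonemes).
    """
    words = correct_words = phonemes = errors = 0
    for ref, hyp in pairs:
        words += 1
        correct_words += ref == hyp
        phonemes += len(ref)
        errors += edit_distance(ref, hyp)
    return correct_words / words, 1 - errors / phonemes
```

Because one wrong phoneme makes the whole word wrong, word accuracy is generally well below phoneme accuracy, which is consistent with the gap between the two figures reported in the abstract.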
Advances in Psychology | 1986
Sheri Hunnicutt
A lexical prediction system is a device that could be of great help to people with certain communicative handicaps. Such a system is described. It can speed up communication considerably via speech synthesis, thereby making communication run more smoothly and reducing some of the awkwardness that arises when lexical prediction is not used.
International Conference on Acoustics, Speech, and Signal Processing | 1982
Rolf Carlson; Björn Granström; Sheri Hunnicutt
A multi-language, portable text-to-speech system has been developed at the Royal Institute of Technology in Stockholm. The system contains a formant speech synthesizer on a signal processing chip, a powerful microcomputer and a variety of text input equipment. A special attachment is a 500-symbol Bliss Board. Swedish and English Bliss-to-speech programs transform a symbol string to the corresponding well-formed sentence. Bliss symbols and spelled text can be intermixed to produce either a spoken or written message. A lexicon gives the pronunciation, part of speech and other grammatical features for each Bliss symbol on the Bliss Board. This information is used in a phrase structure grammar, which can be modeled by a simple ATN, to delimit noun, verb and prepositional phrases. Once the phrase structure is established, phrases are marked according to a transformational analysis. Referring to these phrase features, pronouns, nouns, verbs and (in Swedish) adjectives are given correct forms and pronunciations.
Recent Research Towards Advanced Man-Machine Interface Through Spoken Language | 1996
Mats Blomberg; Rolf Carlson; Kjell Elenius; Björn Granström; Sheri Hunnicutt
This chapter reports on some experiments that are part of a long-term project towards a knowledge-based speech recognition system, NEBULA. An extreme stand is taken in these experiments by comparing human speech to predicted pronunciations at the acoustic level with the help of a straightforward pattern-matching technique. The significantly better results when human references are used were not a surprise. It is well known that text-to-speech systems still need more work before they reach human quality. However, the results can be regarded as encouraging. The low levels of NEBULA explore the descriptive power of cues, and use multiple cues to analyze, classify, and segment the speech wave. The mid-portion of NEBULA is currently represented by a syntactic component of the text-to-speech system, morphological decomposition in the text-to-speech system, and a concept-to-speech system.
Journal of the Acoustical Society of America | 1988
Mats Blomberg; Rolf Carlson; Kjell Elenius; Björn Granström; Sheri Hunnicutt
A major problem in large‐vocabulary speech recognition is the collection of reference data and speaker normalization. In this paper, the use of synthetic speech is proposed as a means of handling this problem. An experimental scheme for such a speech recognition system will be described. A rule‐based speech synthesis procedure is used for generating the reference data. Ten male subjects participated in an experiment using a 26‐word test vocabulary recorded in a normal office room. The subjects were asked to read the words from a list with little instruction except to pronounce each word separately. The synthesis was used to build the references. No adjustments were done to the synthesis in this first stage. All the human speakers served better as reference than the synthesis. Differences between natural and synthetic speech have been analyzed in detail at the segmental level. Methods for updating the synthetic speech parameters from natural speech templates will be described. [This work has been supported by the Swedish Board of Technical Development.]
International Conference on Spoken Language Processing | 1996
Rolf Carlson; Sheri Hunnicutt
This paper gives an overview of the NLP (natural language processing) and dialog components in the Waxholm spoken dialog system. We discuss how the dialog and the natural language components are modeled from a generic and a domain-specific point of view. Dialog management based on grammatical rules and lexical semantic features is implemented in our parser. The notation to describe the syntactic rules has been expanded to cover some of our special needs to model the dialog. The parser runs with two different time scales, corresponding to the words in each utterance and to the turns in the dialog. Topic selection is accomplished based on probabilities calculated from user initiatives. Results from parser performance and topic prediction are included in this paper.
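The abstract says only that topic selection is based on probabilities calculated from user initiatives; it does not give the formula. One plausible reading, sketched below purely as an assumption, is to estimate from labeled utterances how often each semantic feature occurs with each topic and then pick the topic with the highest score. The functions `train` and `select_topic`, the add-one smoothing, and the feature and topic names in the usage example are hypothetical and are not taken from the Waxholm system.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Count topics and features.

    examples: list of (features, topic) pairs from labeled user utterances.
    """
    topic_counts = Counter()
    feature_counts = defaultdict(Counter)
    vocabulary = set()
    for features, topic in examples:
        topic_counts[topic] += 1
        for f in features:
            feature_counts[topic][f] += 1
            vocabulary.add(f)
    return topic_counts, feature_counts, vocabulary


def select_topic(model, features):
    """Return the topic with the highest smoothed log-probability score."""
    topic_counts, feature_counts, vocabulary = model
    total = sum(topic_counts.values())
    best_topic, best_score = None, -math.inf
    for topic, count in topic_counts.items():
        # Add-one smoothing so unseen features do not zero out a topic.
        denom = sum(feature_counts[topic].values()) + len(vocabulary)
        score = math.log(count / total)
        for f in features:
            score += math.log((feature_counts[topic][f] + 1) / denom)
        if score > best_score:
            best_topic, best_score = topic, score
    return best_topic


# Usage with made-up training data:
model = train([
    (["boat", "time"], "timetable"),
    (["boat", "depart"], "timetable"),
    (["hotel", "stay"], "lodging"),
])
print(select_topic(model, ["hotel"]))
```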
Assistive Technology | 2007
Sheri Hunnicutt; Tina Magnuson
A method of grammar-guided writing has been devised to guide graphic sign users through the construction of text messages for use in e-mail and other applications with a remote receiver. The purpose is to promote morphologically and syntactically correct sentences. The available grammatical structures in grammar-guided writing are the highest frequency phrase and sentence types in a database of tagged and parsed e-mail messages. A companion learning path allows users to begin with simple grammatical structures, advancing in steps to more complex structures as their language skills develop. In order to give further support during e-mail composition, grammar-guided writing has been augmented with two complementary text input methods. One is a quick and easy method of choosing preprogrammed messages and e-mail phrases. The other is a more traditional method of selecting graphic signs freely without grammar guidance.