
Publication


Featured research published by Hector R. Javkin.


Speech Communication | 2000

Assistive speech technology for persons with speech impairments

Yoshinori Yamada; Hector R. Javkin; Karen Youdelman

This paper describes a computer-based speech training system being developed for persons with speech impairments, especially profoundly deaf children. The system, called the Computer Integrated Speech Training Aid (CISTA), provides objective data and on-line diagnostic information visually, facilitates record keeping for teachers, and increases student motivation. CISTA has been commercially available in Japan since 1988, used mainly in deaf schools and hospitals. Its efficiency and applicability have been evaluated over several years in schools for the deaf and in speech clinics at rehabilitation centers in Japan and the USA. The results of two experiments have shown its effectiveness.


Journal of the Acoustical Society of America | 2000

Esophageal speech injection noise detection and rejection

Hector R. Javkin; Michael Galler; Nancy Niedzielski; Robert Boman

The present invention eliminates injection noise in speech produced by esophageal speakers. A speech input signal is digitized. One copy of the digitized signal is used for analysis and the other is passed through a gain switch to an amplifier as output. A Fast Fourier Transform (FFT) and a mean value of the digitized speech input signal are calculated. The FFT is passed through a morphological filter to produce a filtered spectrum. An occurrence of injection noise is detected by calculating a derivative of the filtered spectrum and determining, from the mean value and the derivative, the location and value of the largest and second-largest peaks in the filtered spectrum. If the largest peak is lower in frequency than the second-largest peak, and if all points above 2 kHz are less than the mean, then an occurrence of injection noise has been detected. An occurrence of silence is detected by center-clipping the filtered spectrum and determining whether there is any energy within a sliding 10-millisecond window for a predetermined amount of time. If no energy is detected within the sliding window for that amount of time, an occurrence of silence has been detected. The output speech signal is passed after an occurrence of injection noise has been detected, and is blocked following an occurrence of silence.
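The two-part peak test described in the abstract (largest peak lower in frequency than the second-largest, and all spectral points above 2 kHz below the mean) can be sketched as follows. This is an illustrative reconstruction, not the patented implementation: the function name, window size, and the simple erosion-dilation stand-in for the morphological filter are all assumptions.

```python
import numpy as np

def detect_injection_noise(frame, sample_rate=16000):
    """Sketch of the peak-based injection-noise test from the abstract.

    Returns True when the spectrum's largest peak lies below the
    second-largest peak in frequency AND every spectral point above
    2 kHz falls below the spectral mean. The morphological filter is
    approximated by an erosion (moving minimum) followed by a
    dilation (moving maximum), i.e. a crude morphological opening.
    """
    spectrum = np.abs(np.fft.rfft(frame))
    mean_val = spectrum.mean()

    # Crude stand-in for the morphological filter (window size assumed).
    w = 5
    eroded = np.array([spectrum[max(0, i - w):i + w + 1].min()
                       for i in range(len(spectrum))])
    filtered = np.array([eroded[max(0, i - w):i + w + 1].max()
                         for i in range(len(eroded))])

    # Locate the two largest values in the filtered spectrum.
    order = np.argsort(filtered)
    largest, second = order[-1], order[-2]

    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    above_2khz = filtered[freqs > 2000.0]

    return bool(freqs[largest] < freqs[second]
                and np.all(above_2khz < mean_val))
```

In the patent's pipeline this decision would gate the gain switch on the output copy of the signal; here it simply returns a flag for one analysis frame.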


Speech Communication | 1991

The effects of breathy voice on intelligibility

Hector R. Javkin; Brian A. Hanson; Abigail Kaun

Breathiness is used to form linguistic contrasts in some languages, but also characterizes speakers as individuals and, to an extent, gender. The acoustic consequences of breathy phonation are varied, and separable in synthetic speech: they include the introduction of a frication component into the voice source, a raising of the relative amplitude of the first harmonic and a lowering of the overall spectral tilt. Henton and Bladon (1985) claimed that breathiness diminishes intelligibility. The experiments described in the present paper used synthetic speech to determine the effect of adding a noise source to a modal voice source and to determine the effects of the different acoustic consequences of breathiness on the intelligibility of isolated words. No significant effects were found.


Journal of the Acoustical Society of America | 2004

Systematic speaker variation and within‐speaker center of gravity correlations in the TIMIT database

Hector R. Javkin; Carol Christie; Gaston R. Cangiano; Elaine Drom; Katia McClain

The systematicity of speaker variation in consonants was examined by measuring the noise component of consonants in the TIMIT database. Fricatives and stops were compared by measuring at the temporal middle of fricatives and at the release of stops. There is a high correlation between many of the center of gravity measures. For example, if a speaker has a particularly high center of gravity in the sound /t/, she or he will also have a high center of gravity in /d/, /s/, /p/, /k/, and other consonants. The full set of correlations will be described in the paper. The correlations appear to stem from individual differences and not from dialect variations. The implications of the results for rapid speaker adaptation in speech recognition will be explored. [Work supported by San Jose State University Faculty Grant.]
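A spectral center of gravity of the kind measured in this study can be computed as an amplitude-weighted mean frequency of an analysis frame. This is a generic sketch: the paper's exact frame size, windowing, and weighting are not given in the abstract, so the choices below (Hann window, linear amplitude weighting) are assumptions.

```python
import numpy as np

def spectral_center_of_gravity(frame, sample_rate=16000):
    """Amplitude-weighted mean frequency of one analysis frame.

    For a fricative this would be measured at the temporal middle;
    for a stop, at the release, as described in the abstract.
    """
    spectrum = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    return float(np.sum(freqs * spectrum) / np.sum(spectrum))
```

For a pure tone the measure returns approximately the tone's frequency; for the noise component of a consonant it tracks where the spectral energy is concentrated, which is what makes per-speaker comparisons across consonants possible.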


Journal of the Acoustical Society of America | 1997

Characteristics of roughness in esophageal speech

Hector R. Javkin; Nancy Niedzielski; James Reed

Esophageal voice, produced by the vibration of the superior sphincter of the esophagus by persons who have been laryngectomized, is almost invariably associated with voice source roughness. Using inverse filtering of speech recorded to minimize phase distortion, the source characteristics of six esophageal speakers with different levels of proficiency were examined. With our three less proficient speakers, the roughness is clearly evident in the volume velocity waveform. Highly proficient esophageal speakers, however, although they have a voice quality which is perceived as rough, nevertheless produce a volume velocity waveform surprisingly similar to that of laryngeal speakers. The characteristics that produce this perceived roughness, other than jitter and shimmer, are difficult to detect in the volume velocity waveform. They are noise components of low amplitude and relatively high frequency, which are attenuated in the (integrated) volume velocity. They are best observed in the differentiated volume v...


Journal of the Acoustical Society of America | 1993

Speech synthesis by rule, text‐to‐speech, and aids for persons with disabilities

Hector R. Javkin

Speech synthesis by rule and its subsequent development into text‐to‐speech have progressed into both science and technology for use by the general population, particularly by persons with disabilities. This paper briefly reviews the advances in speech synthesis by rule and text‐to‐speech and then focuses on a new development—the use of text‐to‐speech in a speech training system for deaf children. The method uses the acoustic parameters that a text‐to‐speech system supplies to its formant synthesizer and converts them to pseudoarticulatory parameters equivalent to parameters measured from instruments monitoring the child’s production. For example, the relative frequencies of the nasal pole and nasal zero are converted to a ‘‘nasalization index’’ equivalent to the output of a nasal sensor. The method enables a student to type any utterance she/he wants to learn and see a representation of the articulation of that utterance that corresponds to the feedback received from instruments. Preliminary testing of t...


Journal of the Acoustical Society of America | 1988

Evidence for the 3‐Bark integration interval

Brian A. Hanson; Hector R. Javkin

It is generally accepted that sound energy within a 1‐Bark interval is integrated by the ear. Chistovich et al. [in Frontiers of Speech Communication Research, edited by Lindblom and Ohman (Academic, New York, 1979), pp. 143–157] found evidence of a larger integration interval, of approximately 3 Bark, for vowel perception. Klatt [J. Acoust. Soc. Am. Suppl. 1 77, S7 (1985)] found that listeners could distinguish between vowels whose formant differences were compensated by bandwidth differences (so that cue trading did not occur) and concluded that the 3‐Bark interval was not supported. The present paper analyzes the effect of different intervals on the center of gravity analysis described in Javkin et al. [J. Acoust. Soc. Am. Suppl. 1 82, S81 (1988)] adapted from Chistovich and Chernova [Speech Commun. 5, 3–16 (1986)], and compares that analysis with a perceptual experiment testing the effect of harmonics on the perception of formants. The results support the hypothesis that the minimum interval is about ...
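The Bark intervals at issue here can be approximated numerically. The sketch below uses Traunmuller's (1990) Hz-to-Bark formula as a stand-in for the tabulated Zwicker scale that the cited literature is based on; the helper names are assumptions.

```python
def hz_to_bark(f):
    """Traunmuller's (1990) approximation of the Bark scale."""
    return 26.81 * f / (1960.0 + f) - 0.53

def within_3_bark(f1_hz, f2_hz):
    """True when two spectral peaks fall within a 3-Bark span,
    the integration interval hypothesized for vowel perception."""
    return abs(hz_to_bark(f1_hz) - hz_to_bark(f2_hz)) <= 3.0

# Example: F1 = 500 Hz and F2 = 700 Hz lie well inside 3 Bark
# (and so would be integrated into one spectral center of gravity
# under the hypothesis), while 500 Hz and 2500 Hz do not.
```

Because the Bark scale is roughly logarithmic above about 500 Hz, a fixed 3-Bark interval spans a much wider Hz range at higher frequencies, which is why the question matters for closely spaced formants.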


Journal of the Acoustical Society of America | 1988

Text‐to‐speech system for English and Japanese

Kenji Matsui; Noriyo Hara; Masaaki Kitano; Hector R. Javkin; Kazue Hata; Hisashi Wakita

A real‐time text‐to‐speech system for English and Japanese has been developed. This system consists of a language processing module, a phonetic acoustic processing module, and a synthesis module. Full general English and Japanese sentences can be converted to speech. The Japanese software and English software are independent except for the synthesis module. The features of this system are as follows. (1) The synthesis module is a phoneme‐based cascade‐parallel formant synthesizer with high observed intelligibility (73.5% for the 119 Japanese monosyllables). (2) This system has a 3000‐morpheme English dictionary and 40 000‐word Japanese dictionary with a high‐speed search algorithm. (3) A large speech database was collected for the development of Japanese prosody rules. (4) For the precise control of pitch contour, the Fujisaki model was adopted. (5) One of the two systems developed can stand alone; the other requires a personal computer with a high‐speed DSP board. (6) In the development of this system, s...


Archive | 1989

A Multi-lingual Text-to-Speech System

Hector R. Javkin; Kazue Hata; Lucio Mendes; Steven D. Pearson; Hisayo Ikuta; Abigail Kaun; Gregory DeHaan; Alan Jackson; Beatrix Zimmermann; Tracy Wise; Caroline Henton; Mertilyn Cow; Kenji Matsui; Noriyo Hara; Masaaki Kitano; Der-Hwa Lin; Chun-Hong Lin


Conference of the International Speech Communication Association | 1990

Text-to-speech synthesis using a natural voice source.

Stephen D. Pearson; Hector R. Javkin; Kenji Matsui; Takahiro Kamai

Collaboration


Dive into Hector R. Javkin's collaborations.

Top Co-Authors

Abigail Kaun, University of California

Karen Youdelman, Lexington School for the Deaf

Steven D. Pearson, National Institutes of Health