Publication


Featured research published by Colin W. Wightman.


Journal of the Acoustical Society of America | 1992

Segmental durations in the vicinity of prosodic phrase boundaries

Colin W. Wightman; Stefanie Shattuck-Hufnagel; Mari Ostendorf; Patti Price

Numerous studies have indicated that prosodic phrase boundaries may be marked by a variety of acoustic phenomena including segmental lengthening. It has not been established, however, whether this lengthening is restricted to the immediate vicinity of the boundary, or if it extends over some larger region. In this study, segmental lengthening in the vicinity of prosodic boundaries is examined and found to be restricted to the rhyme of the syllable preceding the boundary. By using a normalized measure of segmental lengthening, and by compensating for differences in speaking rate, it is also shown that at least four distinct types of boundaries can be distinguished on the basis of this lengthening.
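As a rough illustration of what a normalized lengthening measure can look like, the sketch below z-scores each segment duration against reference durations for the same phone. This is an assumption made for illustration only: the abstract does not state the exact normalization used, the speaking-rate compensation it mentions is not modeled here, and the function name and sample durations are hypothetical.

```python
from statistics import mean, stdev

def normalized_durations(segments, reference):
    """Z-score each (phone, duration) pair against reference durations.

    segments:  list of (phone, duration_ms) tuples to normalize
    reference: dict mapping phone -> list of reference durations in ms
    Both inputs are hypothetical illustration data, not data from the paper.
    """
    scores = []
    for phone, duration in segments:
        mu = mean(reference[phone])
        sigma = stdev(reference[phone])  # sample standard deviation
        scores.append((phone, (duration - mu) / sigma))
    return scores

# Made-up reference durations (ms) for two phones
reference = {"ae": [100, 120, 140], "t": [40, 50, 60]}
scores = normalized_durations([("ae", 160), ("t", 50)], reference)
```

A positive score means the segment is longer than is typical for its phone, which makes lengthening comparable across phones with very different inherent durations.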


IEEE Transactions on Speech and Audio Processing | 1994

Automatic labeling of prosodic patterns

Colin W. Wightman; Mari Ostendorf

This paper describes a general algorithm for labeling prosodic patterns in speech, which provides a mechanism for mapping sequences of observations (vectors of acoustic correlates) to prosodic labels using decision trees and a Markov sequence model. Important and novel features of the approach are that it allows many dissimilar correlates to be treated in a unified manner to provide more robust labeling, and that it is designed to be a post-word-recognition processing step. Application of the algorithm is illustrated with experimental results for labeling prosodic phrasing and phrasal prominence in two corpora of professionally read speech. The labels produced by the automatic algorithm exhibit agreement with hand-labeled prominence and phrasing that is close to the agreement between different human labelers.


International Conference on Acoustics, Speech, and Signal Processing | 1991

Automatic recognition of prosodic phrases

Colin W. Wightman; Mari Ostendorf

The authors report on the development of two algorithms to automatically detect prosodic phrases. The first algorithm uses simple frame-based likelihood classifiers to detect breaths and silences, which yields a large percentage of major phrase breaks. To label other levels of prosodic structure, a second algorithm is introduced that uses phoneme durations given by a speech recognizer in conjunction with a tree quantizer and hidden Markov model to label a hierarchy of prosodic phrase breaks. This second algorithm yields phrase break predictions that have good correlation with hand labels, and correctly detects more than 90% of the major phrase boundaries.


Computer Speech & Language | 1993

Parse scoring with prosodic information: an analysis/synthesis approach

Mari Ostendorf; Colin W. Wightman; Nanette Veilleux

Prosody is used by human listeners to disambiguate spoken language and, in particular, the relative size and location of prosodic phrase boundaries provides an important cue for resolving syntactic ambiguity. Therefore, automatically detected prosodic phrase boundaries should provide information useful in speech understanding for choosing among several candidate parses. Here, we propose two scoring algorithms to rank candidate parses, both based on an analysis/synthesis approach that compares the recognized prosodic phrase structure (analysis) with the predicted structure (synthesis) for each candidate parse. The two scoring algorithms, one rule-based and one using a probabilistic model, yield similar overall results when evaluated in experiments with a corpus of ambiguous sentences read by FM radio announcers. To decouple the performance of the analysis and synthesis components, we have used the scoring algorithms with hand-labeled breaks, which results in disambiguation performance comparable to the performance of human subjects in perceptual experiments. Performance degrades somewhat using automatically recognized breaks.


Archive | 1997

The Aligner: Text-to-Speech Alignment Using Markov Models

Colin W. Wightman; David Talkin

Development of high-quality synthesizers is typically dependent on having a large corpus of speech that has an accurate, time-aligned, phonetic transcription. Producing such transcriptions has been difficult, slow, and expensive. Here we describe the operation and performance of a new software tool that automates much of the transcription process and requires far less training and expertise to be used successfully.


Human Language Technology | 1989

Prosody and parsing

Patti Price; Mari Ostendorf; Colin W. Wightman

We address the role of prosody as a potential information source for the assignment of syntactic structure. We consider the perceptual role of prosody in marking syntactic breaks of various kinds for human listeners, the automatic extraction of prosodic information, and its correlation with perceptual data.


International Conference on Acoustics, Speech, and Signal Processing | 1992

Automatic recognition of intonational features

Colin W. Wightman; Mari Ostendorf

The authors report the initial development of an algorithm to automatically detect boundary tones and prominences in continuous speech. Utilizing phoneme durations given by a speech recognizer, the authors use a tree quantizer and hidden Markov model to label these intonational features. In speaker-independent tests on a corpus of professionally read speech, 77% of the boundary tones were correctly detected while 3% of the detections were false alarms. For prominences, the corresponding numbers were 86% and 14%.


Human Language Technology | 1991

Use of prosody in syntactic disambiguation: an analysis-by-synthesis approach

Colin W. Wightman; N. M. Veilleux; Mari Ostendorf

Experiments have shown that prosody is used by human listeners to disambiguate spoken language and, in particular, that the relative size and location of prosodic phrase boundaries provides a cue for resolving syntactic ambiguity. Therefore, automatically detected prosodic phrase boundaries can provide information useful in speech understanding for choosing among several candidate parses. Here, we propose a scoring algorithm to rank candidate parses based on an analysis-by-synthesis method which compares the observed prosodic phrase structure with the predicted structure for each candidate parse. In experiments with a small corpus of ambiguous sentences spoken by FM radio announcers, we have achieved disambiguation performance close to the performance of human subjects in perceptual experiments.


Journal of the Acoustical Society of America | 1993

Perception of multiple levels of prominence in spontaneous speech

Colin W. Wightman

In both read and spontaneous speech, phrasal prominences play an important role in conveying the speaker’s intent. Prominences serve both to mark important discourse‐related events in the conversation and help to resolve ambiguities at several levels. Many speech researchers report the intuition that there are several levels of prominence, that is, that some prominences are bigger than others. Nonetheless, attempts to train human labelers to mark multiple levels of prominence have not been successful: while there was agreement on the location of prominences, there was little agreement between labelers on the level to be assigned to each. Here, an alternative approach, using a panel of naive listeners to mark prominences in a corpus of spontaneous speech has been taken. Instead of marking multiple levels of prominence, a simple binary labeling was used by each labeler and the level of each prominence determined by the number of labelers marking it. In this paper, the results of this preliminary study are p...
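The pooling step described in this abstract, where each listener gives a binary label and the level of a prominence is the number of listeners who marked it, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the use of word indices as the unit being marked, and the sample panel are hypothetical and do not come from the paper.

```python
from collections import Counter

def prominence_levels(labelings):
    """Count how many labelers marked each word as prominent.

    labelings: one collection of word indices per labeler, holding the
    words that labeler marked as prominent (binary labeling).
    Returns a dict mapping word index -> number of labelers marking it.
    """
    counts = Counter()
    for marked in labelings:
        counts.update(set(marked))  # set() so each labeler counts once per word
    return dict(counts)

# Hypothetical panel of three listeners marking words in one utterance
panel = [{1, 4}, {1}, {1, 4, 6}]
levels = prominence_levels(panel)
```

Words that no listener marked are simply absent from the result, which corresponds to a level of zero.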


Journal of the Acoustical Society of America | 1994

Computational aids for the study of prosody

Colin W. Wightman; David Talkin

In the past few years, corpus‐based methods of inquiry have yielded significant insights into the structure and role of prosody in human speech. Efforts are currently underway to discover the relationship between prosody and syntax and discourse, and to develop automated speech processing systems (both for synthesis and recognition) that take advantage of the information contained in prosody. These efforts, however, are critically dependent upon the availability of large speech corpora in which the relevant prosodic phenomena have been consistently transcribed. If the development of such a corpus is to be cost effective or, indeed, if prosodic cues are to be detected in automated systems, computational tools that facilitate and, where possible, automate the transcription process must be made available. In this paper, some of the tools currently available will be presented and their performance and utility reviewed. In particular, a new tool for generating accurate, time‐aligned phonetic transcriptions of spo...

Collaboration


Colin W. Wightman's collaborations.

Top Co-Authors

Mari Ostendorf

University of Washington
