Publication


Featured research published by David M. Carter.


Computer Speech & Language | 1987

The predominance of strong initial syllables in the English vocabulary

Anne Cutler; David M. Carter

Studies of human speech processing have provided evidence for a segmentation strategy in the perception of continuous speech, whereby a word boundary is postulated, and a lexical access procedure initiated, at each metrically strong syllable. The likely success of this strategy was here estimated against the characteristics of the English vocabulary. Two computerized dictionaries were found to list approximately three times as many words beginning with strong syllables (i.e. syllables containing a full vowel) as beginning with weak syllables (i.e. syllables containing a reduced vowel). Consideration of frequency of lexical word occurrence reveals that words beginning with strong syllables occur on average more often than words beginning with weak syllables. Together, these findings motivate an estimate for everyday speech recognition that approximately 85% of lexical words (i.e. excluding function words) will begin with strong syllables. This estimate was tested against a corpus of 190 000 words of spontaneous British English conversation. In this corpus, 90% of lexical words were found to begin with strong syllables. This suggests that a strategy of postulating word boundaries at the onset of strong syllables would have a high success rate in that few actual lexical word onsets would be missed.


Conference of the European Chapter of the Association for Computational Linguistics | 1989

Lexical acquisition in the Core Language Engine

David M. Carter

The SRI Core Language Engine (CLE) is a general-purpose natural language front end for interactive systems. It translates English expressions into representations of their literal meanings. This paper presents the lexical acquisition component of the CLE, which allows the creation of lexicon entries by users with knowledge of the application domain but not of linguistics or of the detailed workings of the system. It is argued that the need to cater for a wide range of types of back end leads naturally to an approach based on eliciting grammaticality judgments from the user. This approach, which has been used to define a 1200-word core lexicon of English, is described and evaluated.


Computational Linguistics | 2001

The Spoken Language Translator

Manny Rayner; David M. Carter; Pierrette Bouillon; Vassilis Digalakis; Mats Wirén

This original volume describes the Spoken Language Translator (SLT), one of the first major automatic speech translation projects. The SLT system can translate between English, French, and Swedish in the domain of air travel planning, using a vocabulary of about 1500 words, and with an accuracy of about 75%. The authors detail the language processing components, largely built on top of the SRI Core Language Engine, using a combination of general grammars and techniques that allow them to be rapidly customized to specific domains. They base speech recognition on Hidden Markov Model technology, and use versions of the SRI DECIPHER system. This account of SLT is an essential resource for researchers interested in knowing what is achievable in spoken-language translation today.


Human Language Technology | 1994

Combining knowledge sources to reorder N-best speech hypothesis lists

Manny Rayner; David M. Carter; Vassilios Digalakis; Patti Price

A simple and general method is described that can combine different knowledge sources to reorder N-best lists of hypotheses produced by a speech recognizer. The method is automatically trainable, acquiring information from both positive and negative examples. In experiments, the method was tested on a 1000-utterance sample of unseen ATIS data.


Meeting of the Association for Computational Linguistics | 1987

The Derivation of a Grammatically Indexed Lexicon from the Longman Dictionary of Contemporary English

Branimir Boguraev; Ted Briscoe; John A. Carroll; David M. Carter; Claire Grover

We describe a methodology and associated software system for the construction of a large lexicon from an existing machine-readable (published) dictionary. The lexicon serves as a component of an English morphological and syntactic analyser and contains entries with grammatical definitions compatible with the word and sentence grammar employed by the analyser. We describe a software system with two integrated components. One of these is capable of extracting syntactically rich, theory-neutral lexical templates from a suitable machine-readable source. The second supports interactive and semi-automatic generation and testing of target lexical entries in order to derive a sizeable, accurate and consistent lexicon from the source dictionary, which contains partial (and occasionally inaccurate) information. Finally, we evaluate the utility of the Longman Dictionary of Contemporary English as a suitable source dictionary for the target lexicon.


Meeting of the Association for Computational Linguistics | 1991

Translation by Quasi Logical Form Transfer

Hiyan Alshawi; David M. Carter; Manny Rayner; Björn Gambäck

The paper describes work on applying a general purpose natural language processing system to transfer-based interactive translation. Transfer takes place at the level of Quasi Logical Form (QLF), a contextually sensitive logical form representation which is deep enough for dealing with cross-linguistic differences. Theoretical arguments and experimental results are presented to support the claim that this framework has good properties in terms of modularity, compositionality, reversibility and monotonicity.


Meeting of the Association for Computational Linguistics | 1996

Fast Parsing Using Pruning and Grammar Specialization

Manny Rayner; David M. Carter

We show how a general grammar may be automatically adapted for fast parsing of utterances from a specific domain by means of constituent pruning and grammar specialization based on explanation-based learning. These methods together give an order of magnitude increase in speed, and the coverage loss entailed by grammar specialization is reduced to approximately half that reported in previous work. Experiments described here suggest that the loss of coverage has been reduced to the point where it no longer causes significant performance degradation in the context of a real application.


Computer Speech & Language | 1989

Lexical stress and lexical discriminability: Stressed syllables are more informative, but why?

Gerry Altman; David M. Carter

Recent studies have suggested that recognition systems should concentrate their efforts on the identification of stressed syllables, as they contain disproportionately more information than do unstressed syllables. The paper investigates whether this increased informativeness may be outweighed by the informational disadvantage associated with transcribing consecutive segments within the same syllable. Phonotactic correlations between such adjacent segments suggest that the most informative transcription of a polysyllabic word may be one where reliable phonemic information is scattered across different syllables. Lexical statistics are presented which support this view. In addition, the paper considers the reasons for the increased informativeness of stressed syllables, and shows that this is because lexical stress preserves vowel distinctions (and hence information) which would otherwise be lost in lexically unstressed syllables.


Computer Speech & Language | 1987

An information-theoretic analysis of phonetic dictionary access

David M. Carter

Recent studies of English vocabulary have suggested that much of the linguistic content of the speech signal resides in stressed syllables and in broad phonetic classes corresponding to manner of articulation, both of which are comparatively easy to recognize. The implication is drawn that a promising strategy for speech recognition is to concentrate initially on these aspects of the signal, using phonotactic, lexical and (if available) higher level constraints to reduce the need for more detailed analysis. This paper argues that the evaluation criteria used to date in such studies are inappropriate, and, using a more appropriate information-theoretic approach, shows, by repeating a representative experiment, that many of the resulting claims are misleading and that there is in fact no reason to expect a recognition strategy of the type suggested to be particularly fruitful.


Human Language Technology | 1993

A speech to speech translation system built from standard components

Manny Rayner; Hiyan Alshawi; Ivan Bretan; David M. Carter; Vassilios Digalakis; Björn Gambäck; Jaan Kaja; Jussi Karlgren; Bertil Lyberg; Stephen Pulman; Patti Price; Christer Samuelsson

This paper describes a speech to speech translation system using standard components and a suite of generalizable customization techniques. The system currently translates air travel planning queries from English to Swedish. The modular architecture is designed to be easy to port to new domains and languages, and consists of a pipelined series of processing phases. The output of each phase consists of multiple hypotheses; statistical preference mechanisms, the data for which is derived from automatic processing of domain corpora, are used between each pair of phases to filter hypotheses. Linguistic knowledge is represented throughout the system in declarative form. We summarize the architectures of the component systems and the interfaces between them, and present initial performance results.

Collaboration


David M. Carter's collaborations.

Top Co-Authors

Björn Gambäck

Norwegian University of Science and Technology

Ivan Bretan

Swedish Institute of Computer Science

Ted Briscoe

University of Cambridge
