Network


Latest external collaborations at the country level. Dive into details by clicking on the dots.

Hotspot


Dive into the research topics where Stephen Sutton is active.

Publication


Featured research published by Stephen Sutton.


Journal of the Acoustical Society of America | 2010

Method and apparatus of specifying and performing speech recognition operations

Pieter Vermeulen; Robert Savoie; Stephen Sutton; F. S. Mozer

A speech recognition technique is described that has the dual benefits of not requiring collection of recordings for training while using computational resources that are cost-compatible with consumer electronic products. Methods are described for improving the recognition accuracy of a recognizer by developer interaction with a design tool that iterates the recognition data during development of a recognition set of utterances and that allows controlling and minimizing the computational resources required to implement the recognizer in hardware.


international conference on spoken language processing | 1996

Building 10,000 spoken dialogue systems

Stephen Sutton; David G. Novick; Ron Cole; Pieter Vermeulen; J.H de Villiers; Johan Schalkwyk; Mark A. Fanty

Spoken dialogue systems are not yet ubiquitous. But with an easy enough development tool, at a low enough cost, and on portable enough software, advances in spoken dialogue technology could soon enable the rapid development of 10,000 or more spoken dialogue systems for a wide variety of applications. To achieve this goal, we propose a toolkit approach for research and development of spoken dialogue systems. The paper presents the CSLU toolkit, which integrates spoken dialogue technology with an easy-to-use interface. The toolkit supports rapid prototyping, iterative design, empirical evaluation, training of specialized speech recognizers and tools for conducting research to improve the underlying technology. We describe the toolkit with an emphasis on graphical creation of spoken dialogue systems; the transition of the toolkit into the user community; and research directed toward improvements in the toolkit.


Speech Communication | 1997

Experiments with a spoken dialogue system for taking the US census

Ron Cole; David G. Novick; Pieter Vermeulen; Stephen Sutton; Mark A. Fanty; L.F.A Wessels; J.H de Villiers; Johan Schalkwyk; Brian Hansen; D Burnett

This paper reports the results of the development, deployment and testing of a large spoken-language dialogue application for use by the general public. We built an automated spoken questionnaire for the US Bureau of the Census. In the project's first phase, the basic recognizers and dialogue system were developed using 4000 calls. In the second phase, the system was adapted to meet Census Bureau requirements and deployed in the Bureau's 1995 national test of new technologies. In the third phase, we refined the system and showed empirically that an automated spoken questionnaire could successfully collect and recognize census data, and that subjects preferred the spoken system to written questionnaires. Our large data collection effort and two subsequent field tests showed that, when questions are asked correctly, the answers contain information within the desired response categories about 99% of the time.
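The evaluation criterion above — whether a recognized answer contains information within the desired response categories — can be sketched as a simple membership check. The categories and responses below are invented examples for illustration, not Census Bureau data.

```python
# Hypothetical sketch: score recognized answers against each question's
# expected response categories. Categories and answers are invented.

CATEGORIES = {
    "sex": {"male", "female"},
    "tenure": {"owned", "rented", "occupied without rent"},
}

def in_category(question: str, answer: str) -> bool:
    """True if the recognized answer falls within the question's categories."""
    return answer.strip().lower() in CATEGORIES[question]

responses = [("sex", "Female"), ("tenure", "rented"), ("tenure", "hmm")]
# Fraction of answers that landed in a desired response category.
in_rate = sum(in_category(q, a) for q, a in responses) / len(responses)
```

In the deployed system the categories would come from the questionnaire design and the answers from the recognizer, but the scoring reduces to the same per-question check.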


meeting of the association for computational linguistics | 1994

An empirical model of acknowledgment for spoken-language systems

David G. Novick; Stephen Sutton

We refine and extend prior views of the description, purposes, and contexts-of-use of acknowledgment acts through empirical examination of the use of acknowledgments in task-based conversation. We distinguish three broad classes of acknowledgments (other→ackn, self→other→ackn, and self+ackn) and present a catalogue of 13 patterns within these classes that account for the specific uses of acknowledgment in the corpus.


Proceedings 1998 IEEE 4th Workshop Interactive Voice Technology for Telecommunications Applications. IVTTA '98 (Cat. No.98TH8376) | 1998

Connected digit recognition experiments with the OGI Toolkit's neural network and HMM-based recognizers

Piero Cosi; John-Paul Hosom; Johan Schalkwyk; Stephen Sutton; Ronald A. Cole

This paper describes a series of experiments that compare different approaches to training a speaker-independent continuous-speech digit recognizer using the CSLU Toolkit. Comparisons are made between the hidden Markov model (HMM) and neural network (NN) approaches. In addition, a description of the CSLU Toolkit research environment is given. The CSLU Toolkit is a research and development software environment that provides a powerful and flexible tool for creating and using spoken language systems for telephone and PC applications. In particular, the CSLU-HMM, the CSLU-NN, and the CSLU-FBNN development environments, with which our experiments were implemented, are described in detail and recognition results are compared. Our speech corpus is OGI 30K-Numbers, which is a collection of spontaneous ordinal and cardinal numbers, continuous digit strings and isolated digit strings. The utterances were recorded by having a large number of people recite their ZIP code, street address, or other numeric information over the telephone. This corpus represents a very noisy and difficult recognition task. Our best results (98% word recognition, 92% sentence recognition), obtained with the FBNN architecture, suggest the effectiveness of the CSLU Toolkit in building real-life speech recognition systems.
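The word- and sentence-level figures reported above can be computed for any recognizer output with a standard edit-distance comparison of reference and hypothesis strings. The sketch below uses invented digit strings, not the OGI 30K-Numbers corpus.

```python
# Illustrative only: word accuracy (via Levenshtein distance) and exact
# sentence accuracy over (reference, hypothesis) pairs of digit strings.

def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences (one rolling row)."""
    d = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, d[0] = d[0], i
        for j, h in enumerate(hyp, 1):
            prev, d[j] = d[j], min(d[j] + 1,         # deletion
                                   d[j - 1] + 1,     # insertion
                                   prev + (r != h))  # substitution / match
    return d[-1]

def accuracy(pairs):
    """Return (word accuracy, sentence accuracy) over (ref, hyp) pairs."""
    errors = words = exact = 0
    for ref, hyp in pairs:
        errors += edit_distance(ref.split(), hyp.split())
        words += len(ref.split())
        exact += ref == hyp
    return 1 - errors / words, exact / len(pairs)

pairs = [("nine seven two zero one", "nine seven two zero one"),
         ("five five one two", "five nine one two")]   # one substitution
word_acc, sent_acc = accuracy(pairs)
```

Sentence accuracy is the stricter metric — a single substituted digit fails the whole string — which is why the paper's sentence figure (92%) sits well below its word figure (98%).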


international conference on acoustics speech and signal processing | 1998

Accessible technology for interactive systems: a new approach to spoken language research

Ronald A. Cole; Stephen Sutton; Yonghong Yan; Pieter Vermeulen; Mark A. Fanty

In this paper, we argue for a paradigm shift in spoken language technology, from transcription tasks to interactive systems. The current paradigm evaluates speech recognition technology in terms of word recognition accuracy on large vocabulary transcription tasks, such as telephone conversations or media broadcasts. Systems are evaluated in international competitions, with strict rules for participation and well-defined evaluation metrics. Participation in these competitions is limited to a few elite laboratories that have the resources to develop and field systems. We propose a new, more productive and more accessible paradigm for spoken language research, in which research advances are evaluated in the context of interactive systems that allow people to perform useful tasks, such as accessing information from the World Wide Web, while driving a car. These systems are made available for daily use by ordinary citizens through telephone networks or placement in easily accessible kiosks in public institutions. It has previously been argued that this new paradigm, which focuses on the goal of universal access to information for all people, better serves the needs of the research community, as well as the welfare of our citizens. We discuss the challenges and rewards of an interactive system approach to spoken language research, and discuss our initial attempts to stimulate a paradigm shift and engage a large community of researchers through free distribution of the CSLU toolkit.


user interface software and technology | 1997

The CSLU toolkit: rapid prototyping of spoken language systems

Stephen Sutton; Ronald A. Cole

Research and development of spoken language systems is currently limited to relatively few academic and industrial laboratories. This is because building such systems requires multidisciplinary expertise, sophisticated development tools, specialized language resources, substantial computer resources and advanced technologies such as speech recognition and text-to-speech synthesis. At the Center for Spoken Language Understanding (CSLU), our mission is to make spoken language systems commonplace. To do so requires that the technology become less exclusive, more affordable and more accessible. An important step towards satisfying this goal is to place the development of spoken language systems in the hands of real domain experts rather than limit it to technical specialists. To address this problem, we have developed the CSLU Toolkit, an integrated software environment for research and development of telephone-based spoken language systems (Sutton et al., 1996; Schalkwyk et al., 1997). It is designed to support a wide range of research and development activities, including data capture and analysis, corpus development, multilingual recognition and understanding, dialogue design, speech synthesis, speaker recognition and language recognition, and system evaluation, among others. In addition, the Toolkit provides an excellent environment for learning about spoken language technology, providing opportunities for hands-on learning, exploration and experimentation. It has been used as a basis for several short courses in which students have produced a wide range of interesting spoken language applications, such as voice mail, airline reservation and browsing the world wide web by voice (Colton et al., 1996; Sutton et al., 1997). A key module of the Toolkit is a graphical application-creation environment called the CSLU Rapid Prototyper (CSLUrp). This integrates state-of-the-art speaker-independent and vocabulary-independent technology into an easy-to-use graphical interface. It enables spoken language applications to be developed and tested quickly and easily. Figure 1 shows a prototype application being developed using CSLUrp. The current version of CSLUrp allows for the rapid development of structured dialogues. It is designed to require minimal technical expertise on the author's part. It provides an intuitive window-like setting, in which applications are built by placing objects onto a canvas (e.g., a telephone-answering object, a speech recognition object) and connecting them with simple clicks of the mouse. Specifying words or phrases to be recognized by the system is a matter of simply typing them in; similarly, specifying what the system will speak is a matter of typing or recording it. Once an application is complete, it can be run at the press of a button and interacted with either over the telephone or in a desktop setting via microphone and speaker. The capability to alternate between designing and testing an application allows for incremental development and iterative refinement of systems. CSLUrp provides non-expert and even novice users with the ability to create spoken language systems for themselves. As they become more experienced and familiar with the basic capabilities, they can move beyond the scope of CSLUrp and begin to learn about and take advantage of other modules of the CSLU Toolkit.


international conference on acoustics speech and signal processing | 1996

A laboratory course for designing and testing spoken dialogue systems

Don Colton; Ronald A. Cole; David G. Novick; Stephen Sutton

The Spoken Dialogue Systems Laboratory at OGI gives students hands-on experience developing spoken dialogue systems (SDSs) in a rapid prototyping setting. The CSLU rapid prototyper (CSLUrp) allows students to quickly build and operate SDSs for dialogues of arbitrary complexity. CSLUrp consists of a graphical user interface that allows users to create SDSs with speech recognition, speech generation and arbitrary computation. When an application is designed, CSLUrp configures the system from the appropriate libraries in CSLUsh, the CSLU shell. The course explores CSLUrp in enough depth that students can craft and integrate their own low-level components (such as special purpose recognizers). In their final projects, students successfully built and demonstrated SDSs to do a variety of interesting tasks such as voice mail and directory assistance. Students found the course helpful and rated it highly.
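The canvas-of-connected-objects model that CSLUrp exposes can be pictured as a small state graph: each object carries a prompt, and its outgoing edges are keyed by the recognized user answer. The sketch below is a hypothetical Python illustration of that structure, not the CSLU Toolkit's actual API.

```python
# Hypothetical sketch of a structured dialogue as a state graph, in the
# spirit of CSLUrp's connect-objects-on-a-canvas model. Prompts, state
# names and answers are invented; this is not the CSLU Toolkit API.

DIALOGUE = {
    "greet":   {"prompt": "Welcome. Say 'weather' or 'news'.",
                "next": {"weather": "weather", "news": "news"}},
    "weather": {"prompt": "Today is sunny.", "next": {}},
    "news":    {"prompt": "No news today.", "next": {}},
}

def run(dialogue, start, answers):
    """Walk the graph, consuming one recognized answer per non-terminal state."""
    state, transcript = start, []
    answers = iter(answers)
    while True:
        node = dialogue[state]
        transcript.append(node["prompt"])    # system "speaks" its prompt
        if not node["next"]:                 # terminal object: dialogue ends
            return transcript
        state = node["next"][next(answers)]  # follow edge for the answer

transcript = run(DIALOGUE, "greet", ["weather"])
```

Alternating between editing the graph and re-running it mirrors the design-then-test cycle the toolkit's authors describe, with the graphical canvas standing in for the dictionary literal here.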


human language technology | 1994

Corpus development activities at the center for spoken language understanding

Ronald A. Cole; Mike Noel; Daniel C. Burnett; Mark A. Fanty; Terri Lander; Beatrice T. Oshika; Stephen Sutton

This paper describes eight telephone-speech corpora at various stages of development at the Center for Spoken Language Understanding. For each corpus, we describe data collection procedures, methods of soliciting callers, the protocol used to collect the data, the transcriptions that accompany the speech data, and the expected release date. The corpora are available at no charge to academic institutions.


Archive | 1999

Limiting Factors of Automated Telephone Dialogues

David G. Novick; Brian Hansen; Stephen Sutton; Catherine R. Marshall

Collaboration


Dive into Stephen Sutton's collaborations.

Top Co-Authors

David G. Novick
University of Texas at El Paso

Ronald A. Cole
University of Colorado Boulder