Barry Arons

Massachusetts Institute of Technology

Publication

Featured research published by Barry Arons.

ACM Transactions on Computer-Human Interaction | 1997

SpeechSkimmer: a system for interactively skimming recorded speech

Barry Arons

Listening to a speech recording is much more difficult than visually scanning a document because of the transient and temporal nature of audio. Audio recordings capture the richness of speech, yet it is difficult to directly browse the stored information. This article describes techniques for structuring, filtering, and presenting recorded speech, allowing a user to navigate and interactively find information in the audio domain. This article describes the SpeechSkimmer system for interactively skimming speech recordings. SpeechSkimmer uses speech-processing techniques to allow a user to hear recorded sounds quickly, and at several levels of detail. User interaction, through a manual input device, provides continuous real-time control of the speed and detail level of the audio presentation. SpeechSkimmer reduces the time needed to listen by incorporating time-compressed speech, pause shortening, automatic emphasis detection, and nonspeech audio feedback. This article also presents a multilevel structural approach to auditory skimming and user interface techniques for interacting with recorded speech. An observational usability test of SpeechSkimmer is discussed, as well as a redesign and reimplementation of the user interface based on the results of this usability test.

human factors in computing systems | 2001

The audio notebook: paper and pen interaction with structured speech

Lisa J. Stifelman; Barry Arons; Chris Schmandt

This paper addresses the problem that a listener experiences when attempting to capture information presented during a lecture, meeting, or interview. Listeners must divide their attention between the talker and their notetaking activity. We propose a new device, the Audio Notebook, for taking notes and interacting with a speech recording. The Audio Notebook is a combination of a digital audio recorder and paper notebook, all in one device. Audio recordings are structured using two techniques: user structuring based on notetaking activity, and acoustic structuring based on a talker's changes in pitch, pausing, and energy. A field study showed that the interaction techniques enabled a range of usage styles, from detailed review to high speed skimming. The study motivated the addition of phrase detection and topic suggestions to improve access to the audio recordings. Through these audio interaction techniques, the Audio Notebook defines a new approach for navigation in the audio domain.

human factors in computing systems | 1993

VoiceNotes: a speech interface for a hand-held voice notetaker

Lisa J. Stifelman; Barry Arons; Chris Schmandt; Eric A. Hulteen

VoiceNotes is an application for a voice-controlled hand-held computer that allows the creation, management, and retrieval of user-authored voice notes—small segments of digitized speech containing thoughts, ideas, reminders, or things to do. Iterative design and user testing helped to refine the initial user interface design. VoiceNotes explores the problem of capturing and retrieving spontaneous ideas, the use of speech as data, and the use of speech input and output in the user interface for a hand-held computer without a visual display. In addition, VoiceNotes serves as a step toward new uses of voice technology and interfaces for future portable devices.

IEEE Transactions on Consumer Electronics | 1984

A Conversational Telephone Messaging System

Chris Schmandt; Barry Arons

The Phone Slave is an intelligent answering machine, conversing with callers to format messages and relaying personal greetings to identified parties. Its owner can access these voice messages as well as electronic mail via speech recognition or Touch-Tones over the phone network. Access to both incoming and outgoing messages, an on-line directory, and autodial features are also provided by a touch-sensitive color monitor.

acm conference on hypertext | 1991

Hyperspeech: navigating in speech-only hypermedia

Barry Arons

Most hypermedia systems emphasize the integration of graphics, images, video, and audio into a traditional hypertext framework. The hyperspeech system described in this paper, a speech-only hypermedia application, explores issues of navigation and system architecture in an audio environment without a visual display. The system under development uses speech recognition to maneuver in a database of digitally recorded speech segments; synthetic speech is used for control information and user feedback. In this research prototype, recorded audio interviews were segmented by topic, and hypertext-style links were added to connect logically related comments and ideas. The software architecture is data driven, with all knowledge embedded in the links and nodes, allowing the software that traverses through the network to be straightforward and concise. Several user interfaces were prototyped, emphasizing different styles of speech interaction and feedback between the user and machine. In addition to the issues of navigation in a speech-only database, areas of continuing research include dynamically extending the database, use of audio and voice cues to indicate landmarks, and the simultaneous presentation of multiple channels of speech information.
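
As a rough illustration of the data-driven idea in this abstract (all knowledge lives in the nodes and links, so the traversal code stays small), the following minimal Python sketch may help. It is not the paper's implementation: the `Node` class, the `follow` function, the node ids, and the link names "more" and "opposing" are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Node:
    """One recorded speech segment plus its outgoing links (hypothetical layout)."""
    audio: str  # stand-in for a recorded speech segment, e.g. a file name
    links: Dict[str, str] = field(default_factory=dict)  # link name -> target node id


def follow(nodes: Dict[str, Node], current: str, command: str) -> str:
    """Return the node reached by a recognized command, or stay put if no such link exists."""
    return nodes[current].links.get(command, current)


# A tiny hypothetical database of two interview segments.
nodes = {
    "a1": Node("a1.wav", {"more": "a2", "opposing": "b1"}),
    "a2": Node("a2.wav"),
    "b1": Node("b1.wav", {"opposing": "a1"}),
}
position = follow(nodes, "a1", "opposing")  # moves to "b1"
```

Because the traversal function only looks up a link name in the current node, adding new segments or link types means editing the data, not the code.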

user interface software and technology | 1993

SpeechSkimmer: interactively skimming recorded speech

Barry Arons

Skimming or browsing audio recordings is much more difficult than visually scanning a document because of the temporal nature of audio. By exploiting properties of spontaneous speech it is possible to automatically select and present salient audio segments in a time-efficient manner. Techniques for segmenting recordings and a prototype user interface for skimming speech are described. The system developed incorporates time-compressed speech and pause removal to reduce the time needed to listen to speech recordings. This paper presents a multi-level approach to auditory skimming, along with user interface techniques for interacting with the audio and providing feedback. Several time compression algorithms and an adaptive speech detection technique are also summarized.
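
The pause removal mentioned in this abstract can be sketched in a few lines of Python. This is a simplified stand-in, not the paper's method: it uses a fixed energy threshold rather than the adaptive speech detection the abstract refers to, and the function name `shorten_pauses` and all of its parameters are assumptions made for illustration.

```python
from typing import List


def shorten_pauses(samples: List[float], frame_len: int,
                   threshold: float, max_pause_frames: int) -> List[float]:
    """Keep at most max_pause_frames consecutive low-energy frames; drop the rest."""
    out: List[float] = []
    silent_run = 0  # number of consecutive silent frames seen so far
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(s * s for s in frame) / len(frame)  # mean squared amplitude
        if energy < threshold:
            silent_run += 1
            if silent_run > max_pause_frames:
                continue  # pause is already long enough, skip this frame
        else:
            silent_run = 0
        out.extend(frame)
    return out


# One frame of "speech", three frames of silence, one frame of "speech".
signal = [0.5] * 4 + [0.0] * 12 + [0.5] * 4
shortened = shorten_pauses(signal, frame_len=4, threshold=0.01, max_pause_frames=1)
```

Here the three silent frames are cut down to one, so the 20-sample signal becomes 12 samples while a short pause is preserved as a cue between the two speech frames.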

user interface software and technology | 1992

Tools for building asynchronous servers to support speech and audio applications

Barry Arons

Distributed client/server models are becoming increasingly prevalent in multimedia systems and advanced user interface design. A multimedia application, for example, may play and record audio, use speech recognition input, and use a window system for graphical I/O. The software architecture of such a system can be simplified if the application communicates to multiple servers (e.g., audio servers, recognition servers) that each manage different types of input and output. This paper describes tools for rapidly prototyping distributed asynchronous servers and applications, with an emphasis on supporting highly interactive user interfaces, temporal media, and multi-modal I/O. The Socket Manager handles low-level connection management and device I/O by supporting a callback mechanism for connection initiation, shutdown, and for reading incoming data. The Byte Stream Manager consists of an RPC compiler and run-time library that supports synchronous and asynchronous calls, with both a programmatic interface and a telnet interface that allows the server to act as a command interpreter. This paper details the tools developed for building asynchronous servers, several audio and speech servers built using these tools, and applications that exploit the features provided by the servers.
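
The callback mechanism described for the Socket Manager (callbacks for incoming data and for shutdown) can be illustrated with Python's standard `selectors` module. This is only a sketch of the general pattern under stated assumptions, not the tool from the paper: the class name `SocketManager` is borrowed from the abstract, but the methods `add_reader` and `poll` and their signatures are hypothetical.

```python
import selectors
import socket


class SocketManager:
    """Dispatch read and close events on registered sockets to user callbacks."""

    def __init__(self) -> None:
        self._sel = selectors.DefaultSelector()

    def add_reader(self, sock: socket.socket, on_data, on_close) -> None:
        """Register a connection with callbacks for incoming data and shutdown."""
        sock.setblocking(False)
        self._sel.register(sock, selectors.EVENT_READ, (on_data, on_close))

    def poll(self, timeout: float = 0.1) -> None:
        """Wait up to timeout seconds and run callbacks for any ready sockets."""
        for key, _events in self._sel.select(timeout):
            on_data, on_close = key.data
            data = key.fileobj.recv(4096)
            if data:
                on_data(key.fileobj, data)
            else:
                # An empty read means the peer closed the connection.
                self._sel.unregister(key.fileobj)
                key.fileobj.close()
                on_close(key.fileobj)


# Usage with a local socket pair standing in for a client/server connection.
client, server_side = socket.socketpair()
received, closed = [], []
manager = SocketManager()
manager.add_reader(server_side,
                   lambda s, d: received.append(d),
                   lambda s: closed.append(True))
client.sendall(b"play")
manager.poll(1.0)   # delivers b"play" to the data callback
client.close()
manager.poll(1.0)   # delivers the shutdown to the close callback
```

The application never blocks on a single connection: it registers callbacks once and lets the polling loop decide which connection needs attention, which is the property that matters for interactive audio and speech servers.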

user interface software and technology | 1995

Designing auditory interactions for PDAs

Debby Hindus; Barry Arons; Lisa J. Stifelman; Bill Gaver; Elizabeth D. Mynatt; Maribeth Back

This panel addresses issues in designing audio-based user interactions for small, personal computing devices, or PDAs. One issue is the nature of interacting with an auditory PDA and the interplay of affordances and form factors. Another issue is how both new and traditional metaphors and interaction concepts might be applied to auditory PDAs. The utility and design of nonspeech cues are discussed, as are the aesthetic issues of persona and narrative in designing sounds. Also discussed are commercially available sound and speech components and related hardware tradeoffs. Finally, the social implications of auditory interactions are explored, including privacy, fashion and novel social interactions.

user interface software and technology | 1995

Hands-on demonstration: interacting with SpeechSkimmer

Barry Arons

SpeechSkimmer is an interactive system for quickly browsing and finding information in speech recordings. Skimming speech recordings is much more difficult than visually scanning images, text, or video because of the slow, linear, temporal nature of the audio channel. The SpeechSkimmer system uses a combination of (1) time compression and pause removal, (2) automatically finding segments that summarize a recording, and (3) interaction techniques, to enable a speech recording to be heard quickly and at several levels of detail.

Archive | 1992