Eric D. Scheirer
Massachusetts Institute of Technology
Publications
Featured research published by Eric D. Scheirer.
Journal of the Acoustical Society of America | 1998
Eric D. Scheirer
A method is presented for using a small number of bandpass filters and banks of parallel comb filters to analyze the tempo of, and extract the beat from, musical signals of arbitrary polyphonic complexity and containing arbitrary timbres. This analysis is performed causally, and can be used predictively to guess when beats will occur in the future. Results in a short validation experiment demonstrate that the performance of the algorithm is similar to the performance of human listeners in a variety of musical situations. Aspects of the algorithm are discussed in relation to previous high-level cognitive models of beat tracking.
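As an illustration of the method the abstract describes, here is a minimal Python sketch, not the paper's implementation: subband onset envelopes are fed through feedback comb filters tuned to candidate tempi, and the tempo whose filter resonates most strongly wins. The band edges, envelope smoothing, and tempo grid are illustrative assumptions.

```python
# Minimal comb-filter tempo sketch (illustrative, unoptimized).
import numpy as np
from scipy.signal import butter, lfilter

def subband_envelopes(x, sr, edges=(0, 200, 800, 3200)):
    """Split x into a few frequency bands and return a rectified,
    smoothed onset envelope (positive envelope change) per band."""
    envs = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        if lo == 0:                                   # lowest band: lowpass
            b, a = butter(2, hi, btype="low", fs=sr)
        else:
            b, a = butter(2, [lo, hi], btype="band", fs=sr)
        env = np.abs(lfilter(b, a, x))                # rectify
        b2, a2 = butter(1, 10, btype="low", fs=sr)    # ~10 Hz smoother
        env = lfilter(b2, a2, env)
        envs.append(np.maximum(np.diff(env, prepend=env[0]), 0.0))
    return envs

def comb_energy(env, sr, bpm, alpha=0.9):
    """Output energy of a feedback comb filter whose delay is one beat
    period at `bpm`; it resonates when the envelope repeats at that rate."""
    delay = int(round(sr * 60.0 / bpm))
    y = env.copy()
    for n in range(delay, len(y)):
        y[n] += alpha * y[n - delay]
    return float(np.sum(y ** 2))

def estimate_tempo(x, sr, bpms=range(60, 181, 2)):
    """Score each candidate tempo by summed comb-filter energy
    across subbands; return the best-scoring tempo in BPM."""
    envs = subband_envelopes(x, sr)
    scores = [sum(comb_energy(e, sr, bpm) for e in envs) for bpm in bpms]
    return list(bpms)[int(np.argmax(scores))]

# Quick check: a click track with one click every 0.5 s (120 BPM).
sr = 8000
x = np.zeros(5 * sr)
x[:: sr // 2] = 1.0
print(estimate_tempo(x, sr))    # expected: 120
```

Because the comb filters hold their resonant state, the same machinery can be read out causally to predict upcoming beat times, which is the predictive use the abstract mentions.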
Proceedings of the IEEE | 1998
Barry Vercoe; William G. Gardner; Eric D. Scheirer
Structured audio representations are semantic and symbolic descriptions that are useful for ultralow-bit-rate transmission, flexible synthesis, and perceptually based manipulation and retrieval of sound. We present an overview of techniques for transmitting and synthesizing sound represented in structured format, and for creating structured representations from audio waveforms. We discuss applications for structured audio in virtual environments, music synthesis, gaming, content-based retrieval, interactive broadcast, and other multimedia contexts.
IEEE Transactions on Multimedia | 1999
Eric D. Scheirer; Riitta Väänänen; Jyri Huopaniemi
We present an overview of the AudioBIFS system, part of the Binary Format for Scene Description (BIFS) tool in the MPEG-4 International Standard. AudioBIFS is the tool that integrates the synthetic and natural sound coding functions in MPEG-4. It allows the flexible construction of soundtracks and sound scenes using compressed sound, sound synthesis, streaming audio, interactive and terminal-dependent presentation, three-dimensional (3-D) spatialization, environmental auralization, and dynamic download of custom signal-processing effects algorithms. MPEG-4 sound scenes are based on a model that is a superset of the model in VRML 2.0; we describe how MPEG-4 builds upon VRML and what new capabilities it provides. We discuss the use of the MPEG-4 Structured Audio Orchestra Language (SAOL) for writing downloadable effects, present an example sound scene built with AudioBIFS, and describe the current state of implementations of the standard.
Workshop on Applications of Signal Processing to Audio and Acoustics | 1997
Eric D. Scheirer
A comparison of two models for processing sound is presented: the perceptually-based pitch model of Meddis and Hewitt (1991), and a vocoder model for rhythmic analysis by Scheirer. Similarities in the methods are noted, and it is demonstrated that the pitch model is also adequate for extracting the tempo of acoustic signals. The implications of this finding for perceptual models and signal processing systems are discussed.
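The shared machinery is easy to see in miniature. Both models sum periodicity estimates across channels; the toy sketch below (our construction, not either model's auditory front end) shows a summary autocorrelation of an onset envelope peaking at the beat period, just as it would peak at the pitch period for a faster periodicity.

```python
# Toy summary autocorrelation: the step the two models share.
import numpy as np

def summary_autocorr(envelopes, max_lag):
    """Sum the autocorrelations of several channel envelopes."""
    total = np.zeros(max_lag)
    for e in envelopes:
        e = e - e.mean()
        ac = np.correlate(e, e, mode="full")[len(e) - 1:]
        total += ac[:max_lag]
    return total

# An onset envelope with a 0.5 s period (120 BPM), sampled at 1 kHz.
sr = 1000
env = np.zeros(4 * sr)
env[:: sr // 2] = 1.0
sac = summary_autocorr([env], max_lag=2 * sr)
lag0 = sr // 4                            # skip the trivial peak at lag 0
lag = lag0 + int(np.argmax(sac[lag0:]))
print(60.0 * sr / lag)                    # expected: ~120 (BPM)
```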
IEEE Transactions on Speech and Audio Processing | 2001
Eric D. Scheirer
Structured-audio techniques are a development in audio coding that creates new connections between the existing practices of audio synthesis and audio compression. A theoretical basis for this coding model is presented, grounded in information theory and Kolmogorov complexity theory. It is demonstrated that algorithmic structured audio can provide higher compression ratios than other techniques for many audio signals, and proven rigorously that it can provide compression at least as good as that of every other technique (up to a constant term) for every audio signal. The MPEG-4 Structured Audio standard is the first practical application of algorithmic coding theory. It points the direction toward a new paradigm of generalized audio coding, in which structured-audio coding subsumes all other audio-coding techniques. Generalized audio coding offers new marketplace models that enable advances in compression technology to be rapidly leveraged toward the solution of problems in audio coding.
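The "at least as good up to a constant" claim has the shape of the invariance theorem from Kolmogorov complexity theory. A sketch of the bound, in our notation rather than the paper's: an algorithmic coder can always transmit a competing coder's decoder once, followed by that coder's bitstream.

```latex
% Sketch of the bound; notation ours, not the paper's.
% C : any computable audio coder      D : its decoder (a fixed program)
% x : an audio signal                 L_alg(x) : algorithmic description length
\[
  L_{\mathrm{alg}}(x) \;\le\; |C(x)| + c_D
  \qquad \text{for every audio signal } x,
\]
% where |C(x)| is the length of C's bitstream for x, and c_D is a
% constant depending only on the decoder D, never on the signal.
```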
Workshop on Applications of Signal Processing to Audio and Acoustics | 1999
Eric D. Scheirer
The application of a new technique for sound-scene analysis to the segmentation of complex musical signals is presented. This technique operates by discovering common modulation behavior among groups of frequency subbands in the autocorrelogram domain. The algorithm can be demonstrated to locate perceptual events in time and frequency when it is executed on ecological music examples taken directly from compact disc recordings. It operates within a strict probabilistic framework, which makes it convenient to incorporate into a larger signal-understanding testbed. Only within-channel dynamic signal behavior is used to locate events; therefore, the model stands as a theoretical alternative to methods that use pitch as their primary grouping cue. This segmentation algorithm is one processing element to be included in the construction of music perception systems that understand sound without attempting to separate it into components.
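As a loose illustration of grouping by common modulation, the sketch below marks times at which many subband envelopes rise together. It is a toy stand-in: the actual algorithm works on within-channel dynamics in the autocorrelogram domain inside a strict probabilistic framework, and the band layout and thresholds here are our assumptions.

```python
# Toy "common modulation" event detector (illustrative only).
import numpy as np
from scipy.signal import butter, lfilter

def band_envelopes(x, sr, n_bands=8, lo=100.0, hi=None):
    """Smoothed envelopes of log-spaced bandpass channels,
    shaped (n_bands, n_samples)."""
    hi = hi or 0.4 * sr                           # keep below Nyquist
    edges = np.geomspace(lo, hi, n_bands + 1)
    envs = []
    for f1, f2 in zip(edges[:-1], edges[1:]):
        b, a = butter(2, [f1, f2], btype="band", fs=sr)
        env = np.abs(lfilter(b, a, x))
        b2, a2 = butter(1, 20, btype="low", fs=sr)    # smooth envelope
        envs.append(lfilter(b2, a2, env))
    return np.array(envs)

def common_modulation_events(x, sr, rise_frac=0.2, agree_frac=0.5):
    """Return event times (seconds) where at least `agree_frac` of the
    bands show a simultaneous, significant envelope rise."""
    envs = band_envelopes(x, sr)
    rise = np.maximum(np.diff(envs, axis=1), 0.0)     # per-band rises
    rise /= rise.max(axis=1, keepdims=True) + 1e-12   # normalize per band
    agreement = (rise > rise_frac).mean(axis=0)       # fraction of bands rising
    hits = np.where(agreement >= agree_frac)[0]
    if hits.size == 0:
        return hits / sr
    keep = np.insert(np.diff(hits) > 1, 0, True)      # start of each run
    return hits[keep] / sr
```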
Signal Processing: Image Communication | 2000
Eric D. Scheirer; Youngjik Lee; Jae-Woo Yang
In addition to its sophisticated audio-compression capabilities, MPEG-4 contains extensive functions supporting synthetic sound and the synthetic/natural hybrid coding of sound. We present an overview of the Structured Audio format, which allows efficient transmission and client-side synthesis of music and sound effects. We also provide an overview of the Text-to-Speech Interface, which standardizes a single format for communication with speech synthesizers. Finally, we present an overview of the AudioBIFS portion of the Binary Format for Scene Description, which allows the description of hybrid soundtracks, 3-D audio environments, and interactive audio programming. The tools provided for advanced audio functionality in MPEG-4 are a new and important addition to the world of audio standards.
Journal of the Acoustical Society of America | 2000
Eric D. Scheirer
When human listeners are confronted with musical stimuli, they rapidly and automatically orient themselves in the sound and begin to make semantic judgments about it. This ability—which might be termed perception of the musical surface—includes the perception of features such as genre, emotional content, and high‐level descriptions. There is relatively little known today about the exact perceptual properties that listeners associate with actual musical sounds, or about the physical correlates of these properties. A preliminary study examined the perceptions of 40 musician and nonmusician subjects in response to 150 complex musical stimuli, each 5 s long, taken directly from a popular Internet music database site. Each listener rated each stimulus on six semantic scales: simple–complex, loud–soft, fast–slow, soothing–annoying, interesting–boring, and familiar–unfamiliar. Experimental results indicate that these ratings are made consistently between listeners and between segments of the same song. A psychophysical model, based on a pattern‐recognition framework applied to a subband periodicity representation (the autocorrelogram), was able to predict a significant amount of the variance in responses on the first four scales. The regression weights and the psychophysical model suggest possible physical correlates that may underlie these semantic judgments.
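The pattern-recognition step can be pictured as a regression from summary features of the periodicity representation onto each semantic scale. The sketch below uses synthetic placeholder data (not the study's features or ratings) only to show how the predicted share of response variance would be measured as R².

```python
# Regression sketch with synthetic placeholder data.
import numpy as np

rng = np.random.default_rng(0)

# 150 stimuli x 6 summary features of a periodicity representation
# (placeholders, NOT the study's actual features).
X = rng.normal(size=(150, 6))
w_true = np.array([1.0, -0.5, 0.0, 0.3, 0.0, 0.2])
y = X @ w_true + 0.1 * rng.normal(size=150)   # stand-in for mean ratings

# Least-squares fit and variance explained (R^2) on one semantic scale.
w, *_ = np.linalg.lstsq(X, y, rcond=None)
r2 = 1.0 - np.sum((y - X @ w) ** 2) / np.sum((y - y.mean()) ** 2)
print(f"R^2 = {r2:.3f}")
```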
Journal of the Acoustical Society of America | 2000
Eric D. Scheirer
Research into the use of perceptual models to compress sounds has achieved considerable success in recent years. However, quantized filterbanks and linear‐predictive coding are not the only techniques that can be used for audio compression. Recent research has tightened the connection between sound processing normally conceived as "compression" and that normally conceived as "synthesis." By making this connection explicit, it becomes possible to apply methods taken from the broad literature on the synthesis of sound signals to applications in sound compression [B. L. Vercoe et al., Proc. IEEE 85, 622–640 (1998)]. New techniques that use audio synthesis to enable compression have recently emerged. One of these, termed algorithmic structured audio, builds on research into software synthesis and sound‐description languages. A particular implementation of this method forms the new MPEG‐4 Structured Audio standard, in which the software‐synthesis language SAOL is used to represent sound algorithmically. ...
Archive | 2000
Eric D. Scheirer; Barry Vercoe