Publications


Featured research published by Youngmoo E. Kim.


Journal of the Acoustical Society of America | 1998

Musical instrument identification: A pattern‐recognition approach

Keith D. Martin; Youngmoo E. Kim

A statistical pattern‐recognition technique was applied to the classification of musical instrument tones within a taxonomic hierarchy. Perceptually salient acoustic features—related to the physical properties of source excitation and resonance structure—were measured from the output of an auditory model (the log‐lag correlogram) for 1023 isolated tones over the full pitch ranges of 15 orchestral instruments. The data set included examples from the string (bowed and plucked), woodwind (single, double, and air reed), and brass families. Using 70%/30% splits between training and test data, maximum a posteriori classifiers were constructed based on Gaussian models arrived at through Fisher multiple‐discriminant analysis. The classifiers distinguished transient from continuant tones with approximately 99% correct performance. Instrument families were identified with approximately 90% performance, and individual instruments were identified with an overall success rate of approximately 70%. These preliminary analyses compare favorably with human performance on the same task and demonstrate the utility of the hierarchical approach to classification.
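
A minimal sketch of the classification stage described in this abstract, assuming a precomputed feature matrix X of correlogram-derived features and integer instrument (or family) labels y. The Fisher-discriminant projection and Gaussian maximum a posteriori decision rule follow the abstract; the helper names and the scikit-learn implementation are illustrative stand-ins, not the authors' original code.

    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    def fit_map_classifier(X, y):
        """Fisher multiple-discriminant projection, then one Gaussian model per class."""
        lda = LinearDiscriminantAnalysis()
        Z = lda.fit_transform(X, y)
        models = {}
        for c in np.unique(y):
            Zc = Z[y == c]
            models[c] = (Zc.mean(axis=0), np.cov(Zc, rowvar=False), len(Zc) / len(y))
        return lda, models

    def predict_map(lda, models, X):
        """Maximum a posteriori decision over the per-class Gaussian models."""
        Z = lda.transform(X)
        scores = []
        for mu, cov, prior in models.values():
            diff = Z - mu
            inv = np.linalg.pinv(cov)
            log_lik = -0.5 * np.sum((diff @ inv) * diff, axis=1) - 0.5 * np.linalg.slogdet(cov)[1]
            scores.append(log_lik + np.log(prior))
        return np.array(list(models.keys()))[np.argmax(np.array(scores), axis=0)]

    # 70%/30% train/test split, as in the study:
    # lda, models = fit_map_classifier(X_train, y_train)
    # accuracy = np.mean(predict_map(lda, models, X_test) == y_test)

With a shared covariance across classes this reduces to ordinary linear discriminant classification; per-class covariances make the decision rule quadratic.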


Multimedia Information Retrieval | 2010

Feature selection for content-based, time-varying musical emotion regression

Erik M. Schmidt; Douglas Turnbull; Youngmoo E. Kim

In developing automated systems to recognize the emotional content of music, we are faced with a problem spanning two disparate domains: the space of human emotions and the acoustic signal of music. To address this problem, we must develop models for both data collected from humans describing their perceptions of musical mood and quantitative features derived from the audio signal. In previous work, we have presented a collaborative game, MoodSwings, which records dynamic (per-second) mood ratings from multiple players within the two-dimensional Arousal-Valence representation of emotion. Using this data, we present a system linking models of acoustic features and human data to provide estimates of the emotional content of music according to the arousal-valence space. Furthermore, in keeping with the dynamic nature of musical mood we demonstrate the potential of this approach to track the emotional changes in a song over time. We investigate the utility of a range of acoustic features based on psychoacoustic and music-theoretic representations of the audio for this application. Finally, a simplified version of our system is re-incorporated into MoodSwings as a simulated partner for single-players, providing a potential platform for furthering perceptual studies and modeling of musical mood.
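
As a concrete illustration of the regression setup described above, the sketch below fits a multi-output regressor from per-second acoustic features to per-second arousal-valence labels and predicts an A-V trajectory for a new song. The ridge regressor and the variable names are assumptions for illustration, not the authors' exact models or features.

    import numpy as np
    from sklearn.linear_model import Ridge

    def train_av_regressor(X, Y, alpha=1.0):
        """X: per-second features (n_seconds x n_features); Y: arousal-valence labels (n_seconds x 2)."""
        return Ridge(alpha=alpha).fit(X, Y)  # Ridge handles multi-output targets directly

    def track_emotion(model, X_song):
        """Predict a per-second arousal-valence trajectory for one song."""
        return model.predict(X_song)  # shape: (n_seconds, 2)

    # model = train_av_regressor(X_train, Y_train)
    # trajectory = track_emotion(model, X_new_song)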


Workshop on Applications of Signal Processing to Audio and Acoustics | 2011

Learning emotion-based acoustic features with deep belief networks

Erik M. Schmidt; Youngmoo E. Kim

The medium of music has evolved specifically for the expression of emotions, and it is natural for us to organize music in terms of its emotional associations. But while such organization is a natural process for humans, quantifying it empirically proves to be a very difficult task, and as such no dominant feature representation for music emotion recognition has yet emerged. Much of the difficulty in developing emotion-based features is the ambiguity of the ground-truth. Even using the smallest time window, opinions on the emotion are bound to vary and reflect some disagreement between listeners. In previous work, we have modeled human response labels to music in the arousal-valence (A-V) representation of affect as a time-varying, stochastic distribution. Current methods for automatic detection of emotion in music seek performance increases by combining several feature domains (e.g. loudness, timbre, harmony, rhythm). Such work has focused largely on dimensionality reduction for minor classification performance gains, but has provided little insight into the relationship between audio and emotional associations. In this new work we seek to employ regression-based deep belief networks to learn features directly from magnitude spectra. While the system is applied to the specific problem of music emotion recognition, it could be easily applied to any regression-based audio feature learning problem.
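
A rough sketch of the feature-learning idea described above, using scikit-learn's BernoulliRBM stacked greedily as a stand-in for a deep belief network, with a linear regressor on top mapping the learned features to arousal-valence. The layer sizes are arbitrary and the spectra are assumed scaled to [0, 1], so this is only an approximation of the system described in the abstract, not its actual architecture or training procedure.

    import numpy as np
    from sklearn.neural_network import BernoulliRBM
    from sklearn.linear_model import Ridge

    def pretrain_stack(S, layer_sizes=(256, 64), n_iter=20):
        """Greedy layer-wise RBM pretraining on magnitude spectra S (n_frames x n_bins, scaled to [0, 1])."""
        rbms, H = [], S
        for n_hidden in layer_sizes:
            rbm = BernoulliRBM(n_components=n_hidden, n_iter=n_iter, learning_rate=0.05)
            H = rbm.fit_transform(H)  # hidden activations feed the next layer
            rbms.append(rbm)
        return rbms

    def extract_features(rbms, S):
        """Pass spectra through the stacked RBMs to obtain learned features."""
        H = S
        for rbm in rbms:
            H = rbm.transform(H)
        return H

    # S: magnitude spectrogram frames scaled to [0, 1]; Y: per-frame arousal-valence labels
    # rbms = pretrain_stack(S)
    # regressor = Ridge().fit(extract_features(rbms, S), Y)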


International Conference on Acoustics, Speech, and Signal Processing | 2010

Beat-Sync-Mash-Coder: A web application for real-time creation of beat-synchronous music mashups

Garth Griffin; Youngmoo E. Kim; Douglas Turnbull

We present the Beat-Sync-Mash-Coder, a new tool for semi-automated real-time creation of beat-synchronous music mashups. We combine phase vocoder and beat tracker technology to automate the task of synchronizing clips. Freeing the user from this task allows us to replace the traditional audio editing paradigm of the Digital Audio Workstation with an intuitive clip selection interface. The application is completely web-based and operates in the ubiquitous cross-platform Flash framework. The efficiency of our implementation is reflected in performance tests, which demonstrate that the system can sustain real-time phase vocoding of 5-9 simultaneous audio signals on consumer-level hardware. This allows the user to easily create dynamic, intricate and musically coherent acoustic soundscapes. Based on an initial user study with 24 high school students, we also find that the Beat-Sync-Mash-Coder is engaging and can get students excited about music and technology.
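
A minimal offline sketch of the core beat-synchronization idea described above: estimate each clip's tempo with a beat tracker, phase-vocoder time-stretch one clip so its tempo matches the other's, and mix the result. Here librosa stands in for the application's real-time Flash implementation; the file names and mixing gain are illustrative.

    import numpy as np
    import librosa

    def beat_sync_mix(path_a, path_b, sr=22050):
        """Tempo-match clip B to clip A via phase-vocoder time-stretching, then mix."""
        y_a, _ = librosa.load(path_a, sr=sr)
        y_b, _ = librosa.load(path_b, sr=sr)

        tempo_a, _ = librosa.beat.beat_track(y=y_a, sr=sr)
        tempo_b, _ = librosa.beat.beat_track(y=y_b, sr=sr)

        # Stretching by rate r scales clip B's tempo to tempo_b * r, so choose r = tempo_a / tempo_b
        rate = float(np.asarray(tempo_a).item() / np.asarray(tempo_b).item())
        y_b_sync = librosa.effects.time_stretch(y_b, rate=rate)

        # Trim to the shorter clip and mix at equal gain
        n = min(len(y_a), len(y_b_sync))
        return 0.5 * (y_a[:n] + y_b_sync[:n]), sr

    # mix, sr = beat_sync_mix("clip_a.wav", "clip_b.wav")
    # import soundfile as sf; sf.write("mashup.wav", mix, sr)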


Computer Music Journal | 2012

The problem of the second performer: Building a community around an augmented piano

Andrew McPherson; Youngmoo E. Kim

The design of a digital musical instrument is often informed by the needs of the first performance or composition. Following the initial performances, the designer frequently confronts the question of how to build a larger community of performers and composers around the instrument. Later musicians are likely to approach the instrument on different terms than those involved in the design process, so design decisions that promote a successful first performance will not necessarily translate to broader uptake. This article addresses the process of bringing an existing instrument to a wider musical community, including how musician feedback can be used to refine the instrument's design without compromising its identity. As a case study, the article presents the magnetic resonator piano, an electronically augmented acoustic grand piano that uses electromagnets to induce vibrations in the strings. After initial compositions and performances using the instrument, feedback from composers and performers guided refinements to the design, laying the groundwork for a collaborative project in which six composers wrote pieces for the instrument. The pieces exhibited a striking diversity of style and technique, including instrumental techniques never considered by the designer. The project, which culminated in two concert performances, demonstrates how a new instrument can acquire a community of musicians beyond those initially involved.


Human Factors in Computing Systems | 2011

Multidimensional gesture sensing at the piano keyboard

Andrew McPherson; Youngmoo E. Kim

In this paper we present a new keyboard interface for computer music applications. Where traditional keyboard controllers report the velocity of each key-press, our interface senses up to five separate dimensions: velocity, percussiveness, rigidity, weight, and depth. These dimensions, which we identified based on the pedagogical piano literature and pilot studies with professional pianists, together present a rich picture of physical gestures at the keyboard, including information on the performer's motion before, during, and after a note is played. User studies confirm that the sensed dimensions are intuitive and controllable and that mappings between gesture and sound produce novel, playable musical instruments, even for users without prior keyboard experience. The multidimensional sensing capability demonstrated in this paper is also potentially applicable to button interfaces outside the musical domain.
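
A purely illustrative example of the kind of gesture-to-sound mapping mentioned above. The five dimension names come from the paper and are assumed normalized to [0, 1]; the synthesis parameters and the mapping itself are hypothetical, not one of the instruments described in the user studies.

    def map_gesture_to_sound(velocity, percussiveness, rigidity, weight, depth):
        """Map one key-press's five sensed dimensions to hypothetical synthesis parameters."""
        return {
            "amplitude": velocity,                           # faster key-press -> louder note
            "attack_time": 0.005 + 0.05 * (1.0 - percussiveness),  # more percussive -> sharper attack
            "brightness": 0.3 + 0.7 * rigidity,              # more rigid touch -> brighter timbre
            "sustain_level": weight,                         # arm weight sustains the tone
            "vibrato_depth": depth,                          # continued key depth acts like aftertouch
        }

    # params = map_gesture_to_sound(0.8, 0.6, 0.4, 0.7, 0.2)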


International Conference on Hybrid Information Technology | 2009

Creating an autonomous dancing robot

David Grunberg; Robert Ellenberg; Youngmoo E. Kim; Paul Y. Oh

A robot with the ability to dance autonomously has many potential applications, such as serving as a prototype dancer for choreographers or as a participant in stage performances with human dancers. A robot that dances autonomously must be able to extract several features from audio in real time, including tempo, beat, and style. It must also be able to produce a continuous sequence of humanlike gestures. We chose the Hitec RoboNova as the robot platform for our work on these problems. We have developed a beat identification algorithm that can extract the beat positions from audio in real time for multiple consecutive songs. Our RoboNova can now produce sequences of smooth gestures that are synchronized with the predicted beats and match the tempo of the audio. Our algorithm can also be easily ported to the HUBO, a large humanoid robot that can move in a very humanlike manner.
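
A simplified, offline sketch of the pipeline described above: beat times are estimated from the audio and a cyclic gesture is triggered at each predicted beat. The actual system runs in real time and drives the RoboNova/HUBO servos; here librosa stands in for the beat identification algorithm and send_gesture is a hypothetical placeholder for the robot's motion commands.

    import time
    import librosa

    def dance_to(path, send_gesture, gestures=("arms_up", "arms_down")):
        """Trigger one gesture from a repeating cycle at each predicted beat of the song."""
        y, sr = librosa.load(path)
        _, beat_frames = librosa.beat.beat_track(y=y, sr=sr)
        beat_times = librosa.frames_to_time(beat_frames, sr=sr)

        start = time.monotonic()
        for i, t in enumerate(beat_times):
            # Wait until the predicted beat, then send the next gesture in the cycle
            time.sleep(max(0.0, t - (time.monotonic() - start)))
            send_gesture(gestures[i % len(gestures)])

    # dance_to("song.wav", send_gesture=print)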


Computational Intelligence and Neuroscience | 2017

Comparison of Brain Activation during Motor Imagery and Motor Movement Using fNIRS

Alyssa M. Batula; Jesse Mark; Youngmoo E. Kim; Hasan Ayaz

Motor-activity-related mental tasks are widely adopted for brain-computer interfaces (BCIs) as they are a natural extension of movement intention, requiring no training to evoke brain activity. The ideal BCI aims to eliminate neuromuscular movement, making motor imagery tasks, or imagined actions with no muscle movement, good candidates. This study explores cortical activation differences between motor imagery and motor execution for both upper and lower limbs using functional near-infrared spectroscopy (fNIRS). Four simple finger- or toe-tapping tasks (left hand, right hand, left foot, and right foot) were performed with both motor imagery and motor execution and compared to resting state. Significant activation was found during all four motor imagery tasks, indicating that they can be detected via fNIRS. Motor execution produced higher activation levels, a faster response, and a different spatial distribution compared to motor imagery, which should be taken into account when designing an imagery-based BCI. When comparing left versus right, upper limb tasks are the most clearly distinguishable, particularly during motor execution. Left and right lower limb activation patterns were found to be highly similar during both imagery and execution, indicating that higher resolution imaging, advanced signal processing, or improved subject training may be required to reliably distinguish them.
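
One simple way to test for the task-related activation reported above is a per-channel paired comparison of oxygenated-hemoglobin changes during task blocks versus rest. The sketch below assumes arrays hbo_task and hbo_rest of shape (n_trials, n_channels); it is an illustrative analysis, not the authors' processing pipeline.

    import numpy as np
    from scipy import stats

    def activation_ttest(hbo_task, hbo_rest, alpha=0.05):
        """Paired t-test per fNIRS channel: task-block HbO change vs. matched rest baseline."""
        t, p = stats.ttest_rel(hbo_task, hbo_rest, axis=0)
        return t, p, p < alpha

    # t, p, significant = activation_ttest(hbo_task, hbo_rest)
    # print("channels with significant activation:", np.where(significant)[0])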


IEEE Transactions on Learning Technologies | 2009

Collaborative Online Activities for Acoustics Education and Psychoacoustic Data Collection

Youngmoo E. Kim; Travis M. Doll; Raymond Migneco

Online collaborative game-based activities may offer a compelling tool for mathematics and science education, particularly for younger students in grades K-12. We have created two prototype activities that allow students to explore aspects of different sound and acoustics concepts: the "cocktail party problem" (sound source identification within mixtures) and the physics of musical instruments. These activities are also inspired by recent work using games to collect labeling data for difficult computational problems from players through a fun and engaging activity. Thus, in addition to their objectives as learning activities, our games facilitate the collection of data on the perception of audio and music, with a range of parameter variation that is difficult to achieve for large subject populations using traditional methods. Our activities have been incorporated into a pilot study with a middle school classroom to demonstrate the potential benefits of this platform.


Workshop on Applications of Signal Processing to Audio and Acoustics | 2007

Blind Sparse-Nonnegative (BSN) Channel Identification for Acoustic Time-Difference-of-Arrival Estimation

Yuanqing Lin; Jingdong Chen; Youngmoo E. Kim; Daniel D. Lee

Estimating time-difference-of-arrival (TDOA) remains a challenging task when acoustic environments are reverberant and noisy. Blind channel identification approaches for TDOA estimation explicitly model multipath reflections and have been demonstrated to be effective in dealing with reverberation. Unfortunately, existing blind channel identification algorithms are sensitive to ambient noise. This paper shows how to resolve the noise sensitivity issue by exploiting prior knowledge about an acoustic room impulse response (RIR), namely, that an acoustic RIR can be modeled by a sparse-nonnegative FIR filter. This paper shows how to formulate single-input two-output blind channel identification as a least-squares convex optimization, and how to incorporate the sparsity and nonnegativity priors so that the resulting optimization remains convex and can be solved efficiently. The proposed blind sparse-nonnegative (BSN) channel identification approach for TDOA estimation is not only robust to reverberation, but also robust to ambient noise, as demonstrated by simulations and experiments in real acoustic environments.
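
A simplified sketch of the single-input two-output formulation described above: the cross-relation conv(x1, h2) = conv(x2, h1) is solved as a nonnegative least-squares problem, with a heavily weighted normalization row ruling out the trivial zero solution, and the TDOA is read from the direct-path taps of the estimated impulse responses. The explicit sparsity prior is omitted here (nonnegative least squares already tends to yield sparse solutions), so this is only an approximation of the BSN method, and the filter length and weight rho are illustrative.

    import numpy as np
    from scipy.linalg import toeplitz
    from scipy.optimize import nnls

    def conv_matrix(x, L):
        """Toeplitz matrix T such that T @ h == np.convolve(x, h)[:len(x)]."""
        row = np.zeros(L)
        row[0] = x[0]
        return toeplitz(x, row)

    def bsn_tdoa(x1, x2, L, fs, rho=1e3):
        """Estimate the TDOA (seconds) between two microphone signals x1 and x2."""
        # Cross-relation: x1 * h2 - x2 * h1 = 0, with stacked filter h = [h2; h1] >= 0
        A = np.hstack([conv_matrix(x1, L), -conv_matrix(x2, L)])
        # Heavily weighted row enforcing sum(h) = 1, which rules out h = 0
        A = np.vstack([A, rho * np.ones(2 * L)])
        b = np.zeros(A.shape[0])
        b[-1] = rho
        h, _ = nnls(A, b)
        h2, h1 = h[:L], h[L:]
        # Positive TDOA means the direct path reaches microphone 1 later than microphone 2
        return (np.argmax(h1) - np.argmax(h2)) / fs

    # tdoa = bsn_tdoa(mic1_samples, mic2_samples, L=256, fs=16000)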
