Publication


Featured research published by Erik M. Schmidt.


multimedia information retrieval | 2010

Feature selection for content-based, time-varying musical emotion regression

Erik M. Schmidt; Douglas Turnbull; Youngmoo E. Kim

In developing automated systems to recognize the emotional content of music, we are faced with a problem spanning two disparate domains: the space of human emotions and the acoustic signal of music. To address this problem, we must develop models for both data collected from humans describing their perceptions of musical mood and quantitative features derived from the audio signal. In previous work, we have presented a collaborative game, MoodSwings, which records dynamic (per-second) mood ratings from multiple players within the two-dimensional Arousal-Valence representation of emotion. Using this data, we present a system linking models of acoustic features and human data to provide estimates of the emotional content of music within the arousal-valence space. Furthermore, in keeping with the dynamic nature of musical mood, we demonstrate the potential of this approach to track the emotional changes in a song over time. We investigate the utility of a range of acoustic features based on psychoacoustic and music-theoretic representations of the audio for this application. Finally, a simplified version of our system is re-incorporated into MoodSwings as a simulated partner for single players, providing a potential platform for furthering perceptual studies and modeling of musical mood.
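The approach described above amounts to supervised regression from short-time acoustic features onto per-second arousal-valence labels. Below is a minimal sketch of that idea, not the paper's actual pipeline: the librosa MFCC and spectral-contrast features, the scikit-learn multi-output linear regression, and the feature/label arrays are all stand-ins for illustration.

```python
# Minimal sketch: per-second acoustic features -> arousal-valence (A-V) regression.
# Feature choices and data are placeholders, not the MoodSwings pipeline.
import numpy as np
import librosa
from sklearn.linear_model import LinearRegression

def per_second_features(path, sr=22050, hop=512):
    """Return one feature vector per second of audio (mean of frame-level features)."""
    y, sr = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20, hop_length=hop)
    contrast = librosa.feature.spectral_contrast(y=y, sr=sr, hop_length=hop)
    frames = np.vstack([mfcc, contrast])              # (n_dims, n_frames)
    per_sec = sr // hop                               # frames per second
    n_sec = frames.shape[1] // per_sec
    return np.array([frames[:, i * per_sec:(i + 1) * per_sec].mean(axis=1)
                     for i in range(n_sec)])          # (n_sec, n_dims)

# X: per-second feature vectors; Y: matching (arousal, valence) listener ratings.
# Both are random placeholders here; real data would come from MoodSwings-style labels.
X = np.random.randn(200, 27)
Y = np.random.uniform(-1, 1, size=(200, 2))
model = LinearRegression().fit(X, Y)                  # multi-output A-V regression
av_track = model.predict(X[:5])                       # per-second A-V estimates
```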


acm multimedia | 2013

1000 songs for emotional analysis of music

Mohammad Soleymani; Micheal N. Caro; Erik M. Schmidt; Cheng-Ya Sha; Yi-Hsuan Yang

Music is composed to be emotionally expressive, and emotional associations provide an especially natural domain for indexing and recommendation in today's vast digital music libraries. But such libraries require powerful automated tools, and the development of systems for automatic prediction of musical emotion presents myriad challenges. The perceptual nature of musical emotion necessitates the collection of data from human subjects. The interpretation of emotion varies between listeners; thus, each clip needs to be annotated by a distribution of subjects. In addition, the sharing of large music content libraries for the development of such systems, even for academic research, presents complicated legal issues which vary by country. This work presents a new publicly available dataset for music emotion recognition research and a baseline system. In addressing the difficulties of emotion annotation, we have turned to crowdsourcing, using Amazon Mechanical Turk, and have developed a two-stage procedure for filtering out poor-quality workers. The dataset consists entirely of Creative Commons music from the Free Music Archive, which, as the name suggests, can be shared freely without penalty. The final dataset contains 1000 songs, each annotated by a minimum of 10 subjects, which is larger than many currently available music emotion datasets.
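The two-stage worker-filtering idea can be illustrated with a short sketch; the data structures, thresholds, and criteria below are assumptions for illustration, not the paper's published procedure.

```python
# Hedged sketch of two-stage crowdworker filtering: stage 1 drops workers who
# miss known "gold" clips, stage 2 drops workers whose ratings correlate poorly
# with the consensus of the remaining workers. Thresholds are illustrative.
import numpy as np

def filter_workers(ratings, gold, gold_tol=1.5, min_corr=0.2):
    """ratings: {worker: {clip_id: value}}; gold: {clip_id: expected value}."""
    # Stage 1: gold-clip check.
    stage1 = {w: r for w, r in ratings.items()
              if all(abs(r.get(c, g) - g) <= gold_tol for c, g in gold.items())}
    # Stage 2: agreement with the per-clip consensus of remaining workers.
    clips = sorted({c for r in stage1.values() for c in r})
    consensus = {c: np.mean([r[c] for r in stage1.values() if c in r]) for c in clips}
    kept = {}
    for w, r in stage1.items():
        common = [c for c in r if c in consensus]
        if len(common) < 3:
            continue                                  # too little overlap to judge
        corr = np.corrcoef([r[c] for c in common],
                           [consensus[c] for c in common])[0, 1]
        if corr >= min_corr:
            kept[w] = r
    return kept
```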


workshop on applications of signal processing to audio and acoustics | 2011

Learning emotion-based acoustic features with deep belief networks

Erik M. Schmidt; Youngmoo E. Kim

The medium of music has evolved specifically for the expression of emotions, and it is natural for us to organize music in terms of its emotional associations. But while such organization is a natural process for humans, quantifying it empirically proves to be a very difficult task, and as such no dominant feature representation for music emotion recognition has yet emerged. Much of the difficulty in developing emotion-based features is the ambiguity of the ground truth. Even using the smallest time window, opinions on the emotion are bound to vary and reflect some disagreement between listeners. In previous work, we have modeled human response labels to music in the arousal-valence (A-V) representation of affect as a time-varying, stochastic distribution. Current methods for automatic detection of emotion in music seek performance increases by combining several feature domains (e.g. loudness, timbre, harmony, rhythm). Such work has focused largely on dimensionality reduction for minor classification performance gains, but has provided little insight into the relationship between audio and emotional associations. In this new work, we seek to employ regression-based deep belief networks to learn features directly from magnitude spectra. While the system is applied to the specific problem of music emotion recognition, it could be easily applied to any regression-based audio feature learning problem.
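As a rough stand-in for the deep-belief-network approach described above (without the RBM pre-training a DBN implies), a small multilayer perceptron can be trained to map magnitude-spectrum frames directly to arousal-valence values; the audio and labels below are synthetic placeholders.

```python
# Simplified stand-in for learning emotion-based features from magnitude
# spectra: an MLP regressor from STFT frames to A-V values (not an actual DBN).
import numpy as np
import librosa
from sklearn.neural_network import MLPRegressor

sr = 22050
y = np.random.randn(5 * sr)                                  # placeholder audio signal
S = np.abs(librosa.stft(y, n_fft=1024, hop_length=512)).T    # (n_frames, 513) magnitude spectra
av_labels = np.random.uniform(-1, 1, size=(S.shape[0], 2))   # placeholder per-frame A-V targets

# Two hidden layers as a stand-in for the stacked layers of a deep belief network.
net = MLPRegressor(hidden_layer_sizes=(256, 64), max_iter=200, random_state=0)
net.fit(S, av_labels)
frame_av = net.predict(S[:10])                               # per-frame A-V estimates
```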


workshop on applications of signal processing to audio and acoustics | 2015

Modeling musical rhythm at scale with the Music Genome Project

Matthew Prockup; Andreas F. Ehmann; Fabien Gouyon; Erik M. Schmidt; Youngmoo E. Kim

Musical meter and attributes of rhythmic feel, such as swing, syncopation, and danceability, are crucial when defining musical style. However, they have attracted relatively little attention from the Music Information Retrieval (MIR) community and, when addressed, have proven difficult to model from music audio signals. In this paper, we propose a number of audio features for modeling meter and rhythmic feel. These features are first evaluated and compared to timbral features in the common task of ballroom genre classification. These features are then used to learn individual models for a total of nine rhythmic attributes covering meter and feel using an industrial-sized corpus of over one million examples labeled by experts from Pandora® Internet Radio's Music Genome Project®. Linear models are shown to be powerful, representing these attributes with high accuracy at scale.
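The "one linear model per attribute" setup can be sketched as follows; the attribute names, feature dimensionality, and labels are placeholders rather than Music Genome Project data.

```python
# Sketch of independent linear models per rhythmic attribute, trained on
# precomputed rhythm feature vectors. Data and attribute list are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression

attributes = ["swing", "syncopation", "danceability"]    # hypothetical subset
X = np.random.randn(10000, 60)                           # rhythm feature vectors
labels = {a: np.random.randint(0, 2, size=10000) for a in attributes}

models = {}
for attr in attributes:
    # A solver such as 'saga' keeps training tractable on very large corpora.
    models[attr] = LogisticRegression(solver="saga", max_iter=200).fit(X, labels[attr])

swing_scores = models["swing"].decision_function(X[:5])  # per-track attribute scores
```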


ieee-ras international conference on humanoid robots | 2012

Affective gesturing with music mood recognition

David Grunberg; Alyssa M. Batula; Erik M. Schmidt; Youngmoo E. Kim

The recognition of emotions and the generation of appropriate responses are key components for facilitating more natural human-robot interaction. Music, often called the “language of emotions,” is a particularly useful medium for investigating questions involving the expression of emotion. Likewise, movements and gestures, such as dance, can also communicate specific emotions to human observers. We apply an efficient, causal technique for estimating the emotions (mood) from music audio to enable a humanoid to perform gestures reflecting the musical mood. We implement this system using Hubo, an adult-sized humanoid that has been used in several applications of musical robotics. Our preliminary experiments indicate that the system is able to produce dance-like gestures that are judged by human observers to match the perceived emotion of the music.
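One way to picture the mood-to-gesture step is a simple mapping from the estimated arousal-valence quadrant to a gesture family, with gesture speed scaled by arousal. The gesture names and scaling below are purely illustrative and are not the actual Hubo controller.

```python
# Illustrative A-V quadrant -> gesture mapping (hypothetical gesture names).
def choose_gesture(arousal, valence):
    """arousal, valence in [-1, 1]; returns (gesture_name, speed_scale)."""
    if arousal >= 0 and valence >= 0:
        gesture = "open_arm_sway"      # happy / excited
    elif arousal >= 0 and valence < 0:
        gesture = "sharp_arm_thrust"   # angry / tense
    elif arousal < 0 and valence >= 0:
        gesture = "gentle_head_nod"    # calm / content
    else:
        gesture = "slow_slump"         # sad / tired
    speed_scale = 0.5 + 0.5 * (arousal + 1) / 2       # slower for low arousal
    return gesture, speed_scale

print(choose_gesture(0.7, -0.4))       # ('sharp_arm_thrust', 0.925)
```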


workshop on applications of signal processing to audio and acoustics | 2011

Modeling musical instrument tones as dynamic textures

Erik M. Schmidt; Raymond Migneco; Jeffrey J. Scott; Youngmoo E. Kim

In this work, we introduce the concept of modeling musical instrument tones as dynamic textures. Dynamic textures are multidimensional signals that exhibit certain temporally stationary characteristics such that they can be modeled as observations from a linear dynamical system (LDS). Previous work in dynamic textures research has shown that sequences exhibiting such characteristics can in many cases be re-synthesized by an LDS with high accuracy. Here we demonstrate that short-time Fourier transform (STFT) coefficients of certain instrument tones (e.g. piano, guitar) can be well-modeled under this requirement. We show that these instruments can be re-synthesized using an LDS model with high fidelity, even using low-dimensional models. In looking to ultimately develop models which can be altered to provide control of pitch and articulation, we analyze the connections between musical qualities such as articulation and the linear dynamical system model parameters. Finally, we provide preliminary experiments in the alteration of such musical qualities through model re-parameterization.
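The LDS fit itself has a well-known closed form in the dynamic-textures literature (an SVD for the observation matrix and least squares for the state transition). The sketch below applies that recipe to a synthetic stand-in for an STFT magnitude matrix; it is not the paper's exact formulation.

```python
# Minimal dynamic-texture (LDS) fit and re-synthesis over STFT-like frames.
# The "tone" here is a synthetic decaying spectrogram, purely for illustration.
import numpy as np

def fit_lds(Y, n_states=10):
    """Y: (n_bins, n_frames) observation matrix. Returns (A, C, X)."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    C = U[:, :n_states]                          # observation matrix
    X = np.diag(s[:n_states]) @ Vt[:n_states]    # state trajectory (n_states, T)
    # Least-squares state transition: X[:, 1:] ~= A @ X[:, :-1]
    A = X[:, 1:] @ np.linalg.pinv(X[:, :-1])
    return A, C, X

def resynthesize(A, C, x0, n_frames):
    x, frames = x0, []
    for _ in range(n_frames):
        frames.append(C @ x)
        x = A @ x
    return np.stack(frames, axis=1)              # (n_bins, n_frames)

T, bins = 200, 513
decay = np.exp(-np.arange(T) / 60.0)
Y = np.abs(np.random.randn(bins, 1)) * decay + 0.01 * np.abs(np.random.randn(bins, T))
A, C, X = fit_lds(Y, n_states=5)
Y_hat = resynthesize(A, C, X[:, 0], T)
print(np.linalg.norm(Y - Y_hat) / np.linalg.norm(Y))   # relative reconstruction error
```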


creativity and cognition | 2013

Utilizing music technology as a model for creativity development in K-12 education

David S. Rosen; Erik M. Schmidt; Youngmoo E. Kim

Many students are highly engaged, motivated, and intellectually stimulated by music outside of the classroom. In 2012, the US ranked 17th among developed countries in education. A major commonality among nations outperforming the US is a deeper focus on the arts. We argue it is necessary to find new ways to engage students in music education. In this initial work, we demonstrate that teaching with music technology provides an affordable point of entry for students without formal music training to express their musical sensibilities. Computer-based tools have become the standard for the music industry. We posit that music technology classes serve as an excellent environment for creative development, offering self-awareness of one's creative process, experiential flow learning, and creative thinking skills.


computer music modeling and retrieval | 2012

Analyzing the Perceptual Salience of Audio Features for Musical Emotion Recognition

Erik M. Schmidt; Matthew Prockup; Jeffrey J. Scott; Brian Dolhansky; Brandon G. Morton; Youngmoo E. Kim

While the organization of music in terms of emotional affect is a natural process for humans, quantifying it empirically proves to be a very difficult task. Consequently, no acoustic feature or combination thereof has emerged as the optimal representation for musical emotion recognition. Due to the subjective nature of emotion, determining whether an acoustic feature domain is informative requires evaluation by human subjects. In this work, we seek to perceptually evaluate two of the most commonly used features in music information retrieval: mel-frequency cepstral coefficients and chroma. Furthermore, to identify emotion-informative feature domains, we explore which musical features are most relevant in determining emotion perceptually, and which acoustic feature domains are most variant or invariant to those changes. Finally, given our collected perceptual data, we conduct an extensive computational experiment for emotion prediction accuracy on a large number of acoustic feature domains, investigating pairwise prediction both in the context of a general corpus as well as in the context of a corpus that is constrained to contain only specific musical feature transformations.
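The feature-domain comparison can be sketched as below: each domain (here MFCC vs. chroma) gets its own regressor onto placeholder arousal-valence ratings, and the cross-validated errors are compared. The librosa feature calls, SVR regressor, and replicated data are assumptions for illustration, not the paper's experimental setup.

```python
# Hedged sketch: compare MFCC vs. chroma feature domains for A-V prediction.
import numpy as np
import librosa
from sklearn.svm import SVR
from sklearn.multioutput import MultiOutputRegressor
from sklearn.model_selection import cross_val_score

sr = 22050
y = np.random.randn(5 * sr)                               # placeholder audio clip
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20).mean(axis=1)
chroma = librosa.feature.chroma_stft(y=y, sr=sr).mean(axis=1)

# In practice there is one row per clip; here one clip is replicated with noise
# just so there is something to cross-validate.
X_mfcc = np.tile(mfcc, (50, 1)) + 0.1 * np.random.randn(50, 20)
X_chroma = np.tile(chroma, (50, 1)) + 0.1 * np.random.randn(50, 12)
av = np.random.uniform(-1, 1, size=(50, 2))               # placeholder A-V ratings

for name, X in [("mfcc", X_mfcc), ("chroma", X_chroma)]:
    model = MultiOutputRegressor(SVR())
    scores = cross_val_score(model, X, av, cv=5, scoring="neg_mean_squared_error")
    print(name, -scores.mean())                           # lower error = more informative domain
```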


Archive | 2010

Music emotion recognition: A state of the art review

Youngmoo E. Kim; Erik M. Schmidt; Raymond Migneco; Brandon G. Morton; Patrick Richardson; Jeffrey J. Scott; Jacquelin A. Speck; Douglas Turnbull


international symposium/conference on music information retrieval | 2008

MoodSwings: A collaborative game for music mood label collection

Youngmoo E. Kim; Erik M. Schmidt; Lloyd Emelle
