Publications


Featured research published by Douglas Turnbull.


IEEE Transactions on Audio, Speech, and Language Processing | 2008

Semantic Annotation and Retrieval of Music and Sound Effects

Douglas Turnbull; Luke Barrington; David A. Torres; Gert R. G. Lanckriet

We present a computer audition system that can both annotate novel audio tracks with semantically meaningful words and retrieve relevant tracks from a database of unlabeled audio content given a text-based query. We consider the related tasks of content-based audio annotation and retrieval as one supervised multiclass, multilabel problem in which we model the joint probability of acoustic features and words. We collect a data set of 1700 human-generated annotations that describe 500 Western popular music tracks. For each word in a vocabulary, we use this data to train a Gaussian mixture model (GMM) over an audio feature space. We estimate the parameters of the model using the weighted mixture hierarchies expectation maximization algorithm. This algorithm is more scalable to large data sets and produces better density estimates than standard parameter estimation techniques. The quality of the music annotations produced by our system is comparable with the performance of humans on the same task. Our “query-by-text” system can retrieve appropriate songs for a large number of musically relevant words. We also show that our audition system is general by learning a model that can annotate and retrieve sound effects.
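
A rough sketch of the modeling pipeline: train one GMM per vocabulary word on the feature vectors of the tracks annotated with that word, then annotate a novel track by ranking the words by likelihood. In the sketch below the toy data are hypothetical, and scikit-learn's standard EM stands in for the paper's weighted mixture hierarchies algorithm:

import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
vocab = ["rock", "mellow", "saxophone"]

# Hypothetical training set: each track is a bag of 39-dim MFCC-like
# feature vectors paired with the words annotators applied to it.
tracks = [(rng.normal(i, 1.0, size=(200, 39)), [vocab[i]]) for i in range(3)]

# One GMM per word, fit on the features of all tracks tagged with that word.
word_models = {}
for word in vocab:
    feats = np.vstack([x for x, words in tracks if word in words])
    word_models[word] = GaussianMixture(
        n_components=4, covariance_type="diag", random_state=0
    ).fit(feats)

# Annotation: score a novel track under every word model and rank the words.
novel_track = rng.normal(1, 1.0, size=(200, 39))
scores = {w: m.score(novel_track) for w, m in word_models.items()}
print(sorted(scores, key=scores.get, reverse=True))  # most likely words first

Query-by-text retrieval runs the same machinery in the other direction: given a query word, the database tracks are ranked by their likelihood under that word's model.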


Multimedia Information Retrieval | 2010

Feature selection for content-based, time-varying musical emotion regression

Erik M. Schmidt; Douglas Turnbull; Youngmoo E. Kim

In developing automated systems to recognize the emotional content of music, we are faced with a problem spanning two disparate domains: the space of human emotions and the acoustic signal of music. To address this problem, we must develop models for both data collected from humans describing their perceptions of musical mood and quantitative features derived from the audio signal. In previous work, we have presented a collaborative game, MoodSwings, which records dynamic (per-second) mood ratings from multiple players within the two-dimensional Arousal-Valence representation of emotion. Using this data, we present a system linking models of acoustic features and human data to provide estimates of the emotional content of music according to the arousal-valence space. Furthermore, in keeping with the dynamic nature of musical mood we demonstrate the potential of this approach to track the emotional changes in a song over time. We investigate the utility of a range of acoustic features based on psychoacoustic and music-theoretic representations of the audio for this application. Finally, a simplified version of our system is re-incorporated into MoodSwings as a simulated partner for single-players, providing a potential platform for furthering perceptual studies and modeling of musical mood.
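
The regression step can be sketched as follows, with per-second feature vectors mapped into the arousal-valence plane; the toy features, labels, and ridge regressor are hypothetical stand-ins for the psychoacoustic and music-theoretic features evaluated in the paper:

import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

# Hypothetical training data: one feature vector per second of audio,
# with per-second (arousal, valence) labels in [-1, 1] of the kind
# MoodSwings collects from players.
X_train = rng.normal(size=(5000, 20))
Y_train = np.tanh(X_train[:, :2] + 0.1 * rng.normal(size=(5000, 2)))

model = Ridge(alpha=1.0).fit(X_train, Y_train)

# Track the emotion of a new 30-second clip over time: one
# (arousal, valence) estimate per second forms a trajectory
# through the two-dimensional emotion space.
X_clip = rng.normal(size=(30, 20))
trajectory = model.predict(X_clip)  # shape (30, 2)
print(trajectory[:3])               # first three seconds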


IEEE Transactions on Knowledge and Data Engineering | 2005

Fast recognition of musical genres using RBF networks

Douglas Turnbull; Charles Elkan

This paper explores the automatic classification of audio tracks into musical genres. Our goal is to achieve human-level accuracy with fast training and classification. This goal is achieved with radial basis function (RBF) networks by using a combination of unsupervised and supervised initialization methods. These initialization methods yield classifiers that are as accurate as RBF networks trained with gradient descent (which is hundreds of times slower). In addition, feature subset selection further reduces training and classification time while preserving classification accuracy. Combined, our methods succeed in creating an RBF network that matches the musical classification accuracy of humans. The general algorithmic contribution of this paper is to show experimentally that RBF networks initialized with a combination of methods can yield good classification performance without relying on gradient descent. The simplicity and computational efficiency of our initialization methods produce classifiers that are fast to train as well as fast to apply to novel data. We also present an improved method for initializing the k-means clustering algorithm, which is useful for both unsupervised and supervised initialization methods.
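
A minimal sketch of training an RBF network without gradient descent, assuming k-means places the basis centers and least squares fits the output layer in closed form (the paper combines several unsupervised and supervised initialization methods; the data here are toy blobs standing in for audio features):

import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Toy stand-in for audio features labeled with one of four genres.
X, y = make_blobs(n_samples=300, centers=4, random_state=0)

# Unsupervised initialization: cluster the inputs to place the RBF centers.
km = KMeans(n_clusters=10, n_init=10, random_state=0).fit(X)
centers = km.cluster_centers_
width = np.mean(np.linalg.norm(X[:, None] - centers[None], axis=2))

def rbf_layer(X):
    # Gaussian activation of every input against every center.
    d = np.linalg.norm(X[:, None] - centers[None], axis=2)
    return np.exp(-((d / width) ** 2))

# Output layer: one-hot genre targets, weights fit by least squares,
# so no gradient descent is needed anywhere.
T = np.eye(4)[y]
W, *_ = np.linalg.lstsq(rbf_layer(X), T, rcond=None)

pred = rbf_layer(X) @ W
print("training accuracy:", np.mean(pred.argmax(axis=1) == y))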


International Conference on Acoustics, Speech, and Signal Processing | 2007

Audio Information Retrieval using Semantic Similarity

Luke Barrington; Antoni B. Chan; Douglas Turnbull; Gert R. G. Lanckriet

We improve upon query-by-example for content-based audio information retrieval by ranking items in a database based on semantic similarity, rather than acoustic similarity, to a query example. The retrieval system is based on semantic concept models that are learned from a training data set containing both audio examples and their text captions. Using the concept models, the audio tracks are mapped into a semantic feature space, where each dimension indicates the strength of the semantic concept. Audio retrieval is then based on ranking the database tracks by their similarity to the query in the semantic space. We experiment with both semantic- and acoustic-based retrieval systems on a sound effects database and show that the semantic-based system improves retrieval both quantitatively and qualitatively.
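
The retrieval step reduces to a nearest-neighbor ranking in the semantic space. A minimal sketch, assuming the per-concept strengths have already been computed by trained concept models (the concept names and data are hypothetical):

import numpy as np

rng = np.random.default_rng(0)
concepts = ["dog bark", "rain", "engine", "crowd"]

# Each database item is a point in the semantic space: one dimension
# per concept, holding the strength of that concept in the audio.
database = rng.random(size=(100, len(concepts)))

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Query by example: map the query into the same semantic space, then
# rank database items by semantic, not acoustic, similarity to it.
query = rng.random(len(concepts))
ranking = np.argsort([-cosine(query, item) for item in database])
print("top five item indices:", ranking[:5])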


International ACM SIGIR Conference on Research and Development in Information Retrieval | 2009

Combining audio content and social context for semantic music discovery

Douglas Turnbull; Luke Barrington; Gert R. G. Lanckriet; Mehrdad Yazdani

When attempting to annotate music, it is important to consider both acoustic content and social context. This paper explores techniques for collecting and combining multiple sources of such information for the purpose of building a query-by-text music retrieval system. We consider two representations of the acoustic content (related to timbre and harmony) and two social sources (social tags and web documents). We then compare three algorithms that combine these information sources: calibrated score averaging (CSA), RankBoost, and kernel combination support vector machines (KC-SVM). We demonstrate empirically that each of these algorithms is superior to algorithms that use individual information sources.
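
Of the three combination schemes, calibrated score averaging is the simplest to sketch: map each source's raw scores onto a common probability scale using held-out relevance labels, then average. Isotonic calibration and the toy data below are assumptions for illustration, not the paper's exact procedure:

import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=500)  # held-out relevance labels for one tag

# Raw scores from two sources on incompatible scales, e.g. audio-model
# log-likelihoods versus web-document term counts.
audio_scores = y * 2.0 + rng.normal(size=500)
web_scores = y * 50.0 + rng.normal(scale=30.0, size=500)

# Calibrate each source onto [0, 1] so the scores become comparable...
cal_audio = IsotonicRegression(out_of_bounds="clip").fit(audio_scores, y)
cal_web = IsotonicRegression(out_of_bounds="clip").fit(web_scores, y)

# ...then combine the calibrated scores by simple averaging.
combined = 0.5 * (cal_audio.predict(audio_scores) + cal_web.predict(web_scores))
print(combined[:5])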


Proceedings of the National Academy of Sciences of the United States of America | 2012

Game-powered machine learning

Luke Barrington; Douglas Turnbull; Gert R. G. Lanckriet

Searching for relevant content in a massive amount of multimedia information is facilitated by accurately annotating each image, video, or song with a large number of relevant semantic keywords, or tags. We introduce game-powered machine learning, an integrated approach to annotating multimedia content that combines the effectiveness of human computation, through online games, with the scalability of machine learning. We investigate this framework for labeling music. First, a socially-oriented music annotation game called Herd It collects reliable music annotations based on the “wisdom of the crowds.” Second, these annotated examples are used to train a supervised machine learning system. Third, the machine learning system actively directs the annotation games to collect new data that will most benefit future model iterations. Once trained, the system can automatically annotate a corpus of music much larger than what could be labeled using human computation alone. Automatically annotated songs can be retrieved based on their semantic relevance to text-based queries (e.g., “funky jazz with saxophone,” “spooky electronica,” etc.). Based on the results presented in this paper, we find that actively coupling annotation games with machine learning provides a reliable and scalable approach to making searchable massive amounts of multimedia data.
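
The three-part loop can be sketched as an active-learning cycle; the features, tag, and uncertainty-sampling rule below are hypothetical stand-ins for Herd It's annotations and the paper's selection strategy:

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 16))                  # audio features, one row per song
true_tag = (X[:, 0] + X[:, 1] > 0).astype(int)   # stand-in ground truth for one tag

labeled = list(range(20))  # songs the game has already annotated
for _ in range(5):
    # Train the supervised system on everything labeled so far.
    clf = LogisticRegression().fit(X[labeled], true_tag[labeled])
    # Actively direct the game toward the songs the model is least sure of.
    probs = clf.predict_proba(X)[:, 1]
    uncertainty = -np.abs(probs - 0.5)
    queries = [i for i in np.argsort(uncertainty)[::-1] if i not in labeled]
    labeled += queries[:20]  # players label these in the next round
print("final accuracy:", clf.score(X, true_tag))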


International Conference on Acoustics, Speech, and Signal Processing | 2010

Beat-Sync-Mash-Coder: A web application for real-time creation of beat-synchronous music mashups

Garth Griffin; Youngmoo E. Kim; Douglas Turnbull

We present the Beat-Sync-Mash-Coder, a new tool for semi-automated real-time creation of beat-synchronous music mashups. We combine phase vocoder and beat tracker technology to automate the task of synchronizing clips. Freeing the user from this task allows us to replace the traditional audio editing paradigm of the Digital Audio Workstation with an intuitive clip selection interface. The application is completely web-based and operates in the ubiquitous cross-platform Flash framework. The efficiency of our implementation is reflected in performance tests, which demonstrate that the system can sustain real-time phase vocoding of 5-9 simultaneous audio signals on consumer-level hardware. This allows the user to easily create dynamic, intricate and musically coherent acoustic soundscapes. Based on an initial user study with 24 high school students, we also find that the Beat-Sync-Mash-Coder is engaging and can get students excited about music and technology.
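
The synchronization step can be sketched with off-the-shelf components: estimate each clip's tempo with a beat tracker, then time-stretch one clip with a phase vocoder so the beats line up. Here librosa stands in for the paper's Flash implementation, and the file names are hypothetical:

import librosa

y_a, sr = librosa.load("clip_a.wav")
y_b, _ = librosa.load("clip_b.wav", sr=sr)

# Beat tracker: estimate each clip's tempo in beats per minute.
tempo_a, _ = librosa.beat.beat_track(y=y_a, sr=sr)
tempo_b, _ = librosa.beat.beat_track(y=y_b, sr=sr)

# Phase vocoder: stretch clip B to clip A's tempo, changing its
# speed without changing its pitch.
y_b_sync = librosa.effects.time_stretch(y_b, rate=float(tempo_a / tempo_b))

# The clips now share a tempo and can be mixed into a mashup.
n = min(len(y_a), len(y_b_sync))
mashup = 0.5 * (y_a[:n] + y_b_sync[:n])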


Archive | 2010

Music Emotion Recognition: A State of the Art Review

Youngmoo E. Kim; Erik M. Schmidt; Raymond Migneco; Brandon G. Morton; Patrick Richardson; Jeffrey J. Scott; Jacquelin A. Speck; Douglas Turnbull


International ACM SIGIR Conference on Research and Development in Information Retrieval | 2007

Towards musical query-by-semantic-description using the CAL500 data set

Douglas Turnbull; Luke Barrington; David A. Torres; Gert R. G. Lanckriet


International Symposium/Conference on Music Information Retrieval | 2008

Five Approaches to Collecting Tags for Music

Douglas Turnbull; Luke Barrington; Gert R. G. Lanckriet

Collaboration


Dive into Douglas Turnbull's collaborations.

Top Co-Authors

Charles Elkan

University of California
