
Publications


Featured research published by Bryan Pardo.


IEEE Transactions on Audio, Speech, and Language Processing | 2013

REpeating Pattern Extraction Technique (REPET): A Simple Method for Music/Voice Separation

Zafar Rafii; Bryan Pardo

Repetition is a core principle in music. Many musical pieces are characterized by an underlying repeating structure over which varying elements are superimposed. This is especially true for pop songs where a singer often overlays varying vocals on a repeating accompaniment. On this basis, we present the REpeating Pattern Extraction Technique (REPET), a novel and simple approach for separating the repeating “background” from the non-repeating “foreground” in a mixture. The basic idea is to identify the periodically repeating segments in the audio, compare them to a repeating segment model derived from them, and extract the repeating patterns via time-frequency masking. Experiments on data sets of 1,000 song clips and 14 full-track real-world songs showed that this method can be successfully applied for music/voice separation, competing with two recent state-of-the-art approaches. Further experiments showed that REPET can also be used as a preprocessor to pitch detection algorithms to improve melody extraction.
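
The masking step described above can be sketched on a precomputed magnitude spectrogram. This is a simplified illustration, not the full pipeline: the repeating period is assumed to be given (REPET estimates it from the audio), and the repeating segment model is taken as the element-wise median over the stacked periods.

```python
import numpy as np

def repet_mask(V, period):
    """Sketch of the REPET masking idea on a magnitude spectrogram V
    (freq bins x time frames), given a repeating period in frames."""
    n_bins, n_frames = V.shape
    n_seg = n_frames // period
    # Stack the periodic segments and take the element-wise median
    # as the repeating-segment model.
    segs = V[:, :n_seg * period].reshape(n_bins, n_seg, period)
    model = np.median(segs, axis=1)                    # (n_bins, period)
    # Tile the model back out and never let it exceed the mixture.
    W = np.tile(model, (1, n_seg + 1))[:, :n_frames]
    W = np.minimum(W, V)
    # Soft time-frequency mask for the repeating background.
    return W / np.maximum(V, 1e-12)

# Toy mixture: a strictly periodic "background" plus one non-repeating bin.
rng = np.random.default_rng(0)
background = np.tile(rng.random((64, 20)), (1, 5))     # period = 20 frames
foreground = np.zeros_like(background)
foreground[10, 37] = 5.0                               # loud "vocal" bin
mask = repet_mask(background + foreground, period=20)
```

A mask value near 1 marks a time-frequency bin dominated by the repeating background; applying `1 - mask` instead would isolate the non-repeating foreground.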


IEEE Transactions on Audio, Speech, and Language Processing | 2010

Multiple Fundamental Frequency Estimation by Modeling Spectral Peaks and Non-Peak Regions

Zhiyao Duan; Bryan Pardo; Changshui Zhang

This paper presents a maximum-likelihood approach to multiple fundamental frequency (F0) estimation for a mixture of harmonic sound sources, where the power spectrum of a time frame is the observation and the F0s are the parameters to be estimated. When defining the likelihood model, the proposed method models both spectral peaks and non-peak regions (frequencies further than a musical quarter tone from all observed peaks). It is shown that the peak likelihood and the non-peak region likelihood act as a complementary pair. The former helps find F0s that have harmonics that explain peaks, while the latter helps avoid F0s that have harmonics in non-peak regions. Parameters of these models are learned from monophonic and polyphonic training data. This paper proposes an iterative greedy search strategy to estimate F0s one by one, to avoid the combinatorial problem of concurrent F0 estimation. It also proposes a polyphony estimation method to terminate the iterative process. Finally, this paper proposes a postprocessing method to refine polyphony and F0 estimates using neighboring frames. This paper also analyzes the relative contributions of different components of the proposed method. It is shown that the refinement component eliminates many inconsistent estimation errors. Evaluations are done on ten recorded four-part J. S. Bach chorales. Results show that the proposed method achieves superior F0 estimation and polyphony estimation compared to two state-of-the-art algorithms.
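
The iterative greedy search can be illustrated with a toy version in which a made-up scoring rule stands in for the paper's learned likelihood model: candidate F0s are rewarded for harmonics that explain observed spectral peaks and penalized for harmonics falling in non-peak regions, and the loop stops when no candidate improves the score (a crude polyphony estimate). All numbers below are invented.

```python
from math import log2

def cents(f1, f2):
    """Distance between two frequencies in cents."""
    return abs(1200 * log2(f1 / f2))

def score(f0, peaks, explained, n_harm=5, tol=50):
    s = 0.0
    for h in range(1, n_harm + 1):
        hits = [p for p in peaks if cents(h * f0, p) < tol]
        if not hits:
            s -= 0.5                       # harmonic in a non-peak region
        elif not all(p in explained for p in hits):
            s += 1.0                       # harmonic explains a new peak
    return s

def greedy_f0s(peaks, candidates, n_harm=5, max_polyphony=4):
    f0s, explained = [], set()
    for _ in range(max_polyphony):
        best = max(candidates, key=lambda f: score(f, peaks, explained, n_harm))
        if score(best, peaks, explained, n_harm) <= 0:
            break                          # no candidate helps: stop adding F0s
        f0s.append(best)
        for h in range(1, n_harm + 1):
            explained.update(p for p in peaks if cents(h * best, p) < 50)
    return f0s

# Spectral peaks from two harmonic sources at 220 Hz and 330 Hz.
peaks = [220, 330, 440, 660, 880, 990]
estimated = greedy_f0s(peaks, candidates=[220, 330, 440, 550])
```

Estimating F0s one at a time in this fashion sidesteps the combinatorial explosion of scoring every possible F0 combination jointly.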


ACM/IEEE Joint Conference on Digital Libraries | 2002

HMM-based musical query retrieval

Jonah Shifrin; Bryan Pardo; Colin Meek; William P. Birmingham

We have created a system for music search and retrieval. A user sings a theme from the desired piece of music. Pieces in the database are represented as hidden Markov models (HMMs). The query is treated as an observation sequence and a piece is judged similar to the query if its HMM has a high likelihood of generating the query. The top pieces are returned to the user in rank-order. This paper reports the basic approach for the construction of the target database of themes, encoding and transcription of user queries, and the results of initial experimentation with a small set of sung queries.
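
The scoring idea can be sketched with a miniature forward-algorithm ranker. The two-state HMMs and the three-symbol observation alphabet below are invented for illustration; the paper's models of musical themes are of course larger.

```python
def forward_likelihood(obs, init, trans, emit):
    """P(obs | HMM) via the forward algorithm (no scaling; short queries)."""
    n = len(init)
    alpha = [init[s] * emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[s] * trans[s][t] for s in range(n)) * emit[t][o]
                 for t in range(n)]
    return sum(alpha)

# Two toy 2-state HMMs over a 3-symbol observation alphabet {0, 1, 2}.
pieces = {
    "piece_A": dict(init=[0.5, 0.5],
                    trans=[[0.7, 0.3], [0.3, 0.7]],
                    emit=[[0.8, 0.1, 0.1], [0.1, 0.8, 0.1]]),
    "piece_B": dict(init=[0.5, 0.5],
                    trans=[[0.7, 0.3], [0.3, 0.7]],
                    emit=[[0.1, 0.1, 0.8], [0.1, 0.8, 0.1]]),
}

query = [0, 0, 1, 0]   # transcribed sung query
ranking = sorted(pieces, key=lambda p: -forward_likelihood(query, **pieces[p]))
```

Each database piece is scored by how likely its HMM is to have generated the query, and the pieces are returned in descending order of that likelihood.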


Conference on Human Factors in Computing Systems (CHI) | 2013

Crowdfunding support tools: predicting success & failure

Michael D. Greenberg; Bryan Pardo; Karthic Hariharan; Elizabeth M. Gerber

Creative individuals increasingly rely on online crowdfunding platforms to crowdsource funding for new ventures. For novice crowdfunding project creators, however, there are few resources to turn to for assistance in the planning of crowdfunding projects. We are building a tool for novice project creators to get feedback on their project designs. One component of this tool is a comparison to existing projects. As such, we have applied a variety of machine learning classifiers to learn the concept of a successful online crowdfunding project at the time of project launch. Currently our classifier can predict, with roughly 68% accuracy, whether a project will be successful. The classification results will eventually power a prediction segment of the proposed feedback tool. Future work involves turning the results of the machine learning algorithms into human-readable content and integrating this content into the feedback tool.
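
The overall setup can be sketched as follows. Everything here is made up for illustration: the launch-time features (funding goal, campaign length, presence of a video), the data, and the classifier (a tiny logistic regression standing in for the paper's suite of machine learning classifiers).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400
goal = rng.uniform(1, 100, n)            # funding goal, in $k (invented)
days = rng.uniform(10, 60, n)            # campaign length in days (invented)
video = rng.integers(0, 2, n)            # pitch includes a video? (invented)
X = np.column_stack([goal / 100, days / 60, video, np.ones(n)])
# Made-up ground truth: modest goals plus a video help, with noise.
y = ((goal < 40) & (video == 1) | (rng.random(n) < 0.2)).astype(float)

w = np.zeros(X.shape[1])
for _ in range(2000):                    # plain gradient descent on log-loss
    p = 1 / (1 + np.exp(-X @ w))
    w -= 0.1 * X.T @ (p - y) / n

accuracy = np.mean(((1 / (1 + np.exp(-X @ w))) > 0.5) == (y == 1))
```

The key design point is that every feature must be available at project launch, so the tool can give feedback before the campaign goes live.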


IEEE Journal of Selected Topics in Signal Processing | 2011

Soundprism: An Online System for Score-Informed Source Separation of Music Audio

Zhiyao Duan; Bryan Pardo

Soundprism, as proposed in this paper, is a computer system that separates single-channel polyphonic music audio played by harmonic sources into source signals in an online fashion. It uses a musical score to guide the separation process. To the best of our knowledge, this is the first online system that addresses score-informed music source separation that can be made into a real-time system. The proposed system consists of two parts: 1) a score follower that associates a score position to each time frame of the audio performance; 2) a source separator that reconstructs the source signals for each time frame, informed by the score. The score follower uses a hidden Markov approach, where each audio frame is associated with a 2-D state vector (score position and tempo). The observation model is defined as the likelihood of observing the frame given the pitches at the score position. The score position and tempo are inferred using particle filtering. In building the source separator, we first refine the score-informed pitches of the current audio frame by maximizing the multi-pitch observation likelihood. Then, the harmonics of each source's fundamental frequency are extracted to reconstruct the source signal. Overlapping harmonics between sources are identified and their energy is distributed in inverse proportion to the square of their respective harmonic number. Experiments on both synthetic and human-performed music show that both the score follower and the source separator perform well. Results also show that the proposed score follower works well for highly polyphonic music with some degree of tempo variations.
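
The score-following half can be caricatured with a minimal particle filter over a (score position, tempo) state. The observation model below is a crude stand-in for the paper's multi-pitch likelihood, and every constant is invented; it only shows the predict/weight/resample shape of the inference.

```python
import numpy as np

score_pitches = [60, 62, 64, 65]          # one MIDI pitch per beat (invented)

def obs_likelihood(position, observed_pitch):
    """Reward particles whose score position matches the heard pitch."""
    idx = min(int(position), len(score_pitches) - 1)
    return 1.0 if score_pitches[idx] == observed_pitch else 0.05

rng = np.random.default_rng(2)
n_part = 500
pos = rng.uniform(0, 0.5, n_part)         # particle state: score position (beats)
tempo = rng.uniform(0.05, 0.15, n_part)   #                 tempo (beats/frame)

# One detected pitch per audio frame; true tempo is 0.1 beats/frame.
observed = [60] * 10 + [62] * 10 + [64] * 10 + [65] * 10
for pitch in observed:
    pos = pos + tempo + rng.normal(0, 0.01, n_part)          # motion model
    tempo = np.clip(tempo + rng.normal(0, 0.002, n_part), 0.01, 0.3)
    w = np.array([obs_likelihood(p, pitch) for p in pos])    # weight
    w /= w.sum()
    idx = rng.choice(n_part, n_part, p=w)                    # resample
    pos, tempo = pos[idx], tempo[idx]

estimated_position = float(np.mean(pos))  # should land near the final beat
```

Because the tempo is part of the state, the filter adapts to performances that speed up or slow down relative to the score.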


Journal of the Association for Information Science and Technology | 2004

Name that tune: a pilot study in finding a melody from a sung query

Bryan Pardo; Jonah Shifrin; William P. Birmingham

We have created a system for music search and retrieval. A user sings a theme from the desired piece of music. The sung theme (query) is converted into a sequence of pitch-intervals and rhythms. This sequence is compared to musical themes (targets) stored in a database. The top pieces are returned to the user in order of similarity to the sung theme. We describe, in detail, two different approaches to measuring similarity between database themes and the sung query. In the first, queries are compared to database themes using standard string-alignment algorithms. Here, similarity between target and query is determined by edit cost. In the second approach, pieces in the database are represented as hidden Markov models (HMMs). In this approach, the query is treated as an observation sequence and a target is judged similar to the query if its HMM has a high likelihood of generating the query. In this article we report our approach to the construction of a target database of themes, encoding, and transcription of user queries, and the results of preliminary experimentation with a set of sung queries. Our experiments show that while neither approach is clearly superior, string matching has a slight advantage. Moreover, neither approach surpasses human performance.
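
The first approach can be sketched as standard global string alignment over pitch-interval sequences, with similarity given by total edit cost. The cost values below are illustrative, not the tuned values from the article.

```python
def edit_cost(query, target, indel=1.0):
    """Global alignment (dynamic programming) between two interval sequences."""
    m, n = len(query), len(target)
    D = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        D[i][0] = i * indel
    for j in range(1, n + 1):
        D[0][j] = j * indel
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            # Graded substitution cost: nearby intervals are cheap to confuse.
            sub = abs(query[i - 1] - target[j - 1]) / 2.0
            D[i][j] = min(D[i - 1][j] + indel,        # delete from query
                          D[i][j - 1] + indel,        # insert into query
                          D[i - 1][j - 1] + sub)      # (mis)match
    return D[m][n]

# Pitch-interval sequences (semitones between successive notes; invented data).
themes = {"ode_to_joy": [0, 1, 2, 0, -2, -1],
          "other_theme": [7, -7, 5, -5, 2, 2]}
query = [0, 1, 2, 0, -2]                  # slightly incomplete sung rendition
best = min(themes, key=lambda t: edit_cost(query, themes[t]))
```

Using pitch intervals rather than absolute pitches makes the match invariant to the key the user happens to sing in.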


IEEE Signal Processing Magazine | 2014

Score-Informed Source Separation for Musical Audio Recordings: An overview

Sebastian Ewert; Bryan Pardo; Meinard Mueller; Mark D. Plumbley

In recent years, source separation has been a central research topic in music signal processing, with applications in stereo-to-surround up-mixing, remixing tools for disc jockeys or producers, instrument-wise equalizing, karaoke systems, and preprocessing in music analysis tasks. Musical sound sources, however, are often strongly correlated in time and frequency, and without additional knowledge about the sources, a decomposition of a musical recording is often infeasible. To simplify this complex task, various methods have recently been proposed that exploit the availability of a musical score. The additional instrumentation and note information provided by the score guides the separation process, leading to significant improvements in terms of separation quality and robustness. A major challenge in utilizing this rich source of information is to bridge the gap between high-level musical events specified by the score and their corresponding acoustic realizations in an audio recording. In this article, we review recent developments in score-informed source separation and discuss various strategies for integrating the prior knowledge encoded by the score.


IEEE Transactions on Signal Processing | 2014

Kernel Additive Models for Source Separation

Antoine Liutkus; Derry Fitzgerald; Zafar Rafii; Bryan Pardo; Laurent Daudet

Source separation consists of separating a signal into additive components. It is a topic of considerable interest with many applications that has gathered much attention recently. Here, we introduce a new framework for source separation called Kernel Additive Modelling, which is based on local regression and permits efficient separation of multidimensional and/or nonnegative and/or non-regularly sampled signals. The main idea of the method is to assume that a source at some location can be estimated using its values at other locations nearby, where nearness is defined through a source-specific proximity kernel. Such a kernel provides an efficient way to account for features like periodicity, continuity, smoothness, stability over time or frequency, and self-similarity. In many cases, such local dynamics are indeed much more natural to assess than any global model such as a tensor factorization. This framework permits one to use different proximity kernels for different sources and to separate them using the iterative kernel backfitting algorithm we describe. As we show, kernel additive modelling generalizes many recent and efficient techniques for source separation and opens the path to creating and combining source models in a principled way. Experimental results on the separation of synthetic and audio signals demonstrate the effectiveness of the approach.
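
A toy rendering of the kernel backfitting loop, under simplifying assumptions: two invented proximity kernels (a median over nearby time frames for a source that is stable over time, and a median over nearby frequency bins for a transient source), applied to a magnitude spectrogram, with a Wiener-like redistribution of the mixture at each pass.

```python
import numpy as np

def median_filter_1d(V, half, axis):
    """Median over a (2*half+1)-wide window along one axis (wrap-around edges)."""
    shifts = [np.roll(V, s, axis=axis) for s in range(-half, half + 1)]
    return np.median(np.stack(shifts), axis=0)

def kam_backfit(V, n_iter=5):
    S = [V / 2, V / 2]                        # initial source estimates
    for _ in range(n_iter):
        # Step 1: apply each source's proximity kernel (here: median filters).
        Z = [median_filter_1d(S[0], 4, axis=1),   # smooth over time
             median_filter_1d(S[1], 4, axis=0)]   # smooth over frequency
        # Step 2: redistribute the mixture in proportion to the fits.
        total = Z[0] + Z[1] + 1e-12
        S = [V * Z[0] / total, V * Z[1] / total]
    return S

rng = np.random.default_rng(3)
stable = np.tile(rng.random((32, 1)), (1, 80))    # constant over time
transient = np.zeros((32, 80))
transient[:, 40] = 1.0                            # one broadband click
s0, s1 = kam_backfit(stable + transient)
```

The framework's point is that each source gets its own kernel: swapping in a different notion of "nearby" (periodicity, self-similarity, smoothness) changes the source model without changing the backfitting loop.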


IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) | 2012

Adaptive filtering for music/voice separation exploiting the repeating musical structure

Antoine Liutkus; Zafar Rafii; Roland Badeau; Bryan Pardo; Gaël Richard

The separation of the lead vocals from the background accompaniment in audio recordings is a challenging task. Recently, an efficient method called REPET (REpeating Pattern Extraction Technique) has been proposed to extract the repeating background from the non-repeating foreground. While effective on individual sections of a song, REPET does not allow for variations in the background (e.g. verse vs. chorus), and is thus limited to short excerpts only. We overcome this limitation and generalize REPET to permit the processing of complete musical tracks. The proposed algorithm tracks the period of the repeating structure and computes local estimates of the background pattern. Separation is performed by soft time-frequency masking, based on the deviation between the current observation and the estimated background pattern. Evaluation on a dataset of 14 complete tracks shows that this method can perform at least as well as a recent competitive music/voice separation method, while being computationally efficient.
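
The local-estimation idea can be sketched as follows, assuming the repeating period is already known. Taking a median over period-spaced frames in a sliding window is a simplification of the algorithm's tracked, per-section background model, but it shows why the background may now vary between sections.

```python
import numpy as np

def adaptive_background(V, period, win=3):
    """Local background estimate: for each frame, the median over frames one
    period apart within +/- win periods (so the model can drift over the song)."""
    n_bins, n_frames = V.shape
    B = np.empty_like(V)
    for t in range(n_frames):
        taps = [t + k * period for k in range(-win, win + 1)
                if 0 <= t + k * period < n_frames]
        B[:, t] = np.median(V[:, taps], axis=1)
    return np.minimum(B, V)        # background never exceeds the mixture

# A background whose pattern changes halfway through ("verse" then "chorus").
rng = np.random.default_rng(4)
verse = np.tile(rng.random((48, 10)), (1, 6))
chorus = np.tile(rng.random((48, 10)), (1, 6))
V = np.hstack([verse, chorus])
V[5, 30] += 4.0                    # one non-repeating "vocal" bin
mask = adaptive_background(V, period=10) / np.maximum(V, 1e-12)
```

Because each frame only sees a few periods of context, the verse and chorus each get their own background estimate instead of being averaged together.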


IEEE International Workshop on Multimedia Signal Processing (MMSP) | 2009

Classifying paintings by artistic genre: An analysis of features & classifiers

Jana Zujovic; Lisa Gandy; Scott E. Friedman; Bryan Pardo; Thrasyvoulos N. Pappas

This paper describes an approach to automatically classify digital pictures of paintings by artistic genre. While the task of artistic classification is often entrusted to human experts, recent advances in machine learning and multimedia feature extraction have made this task easier to automate. Automatic classification is useful for organizing large digital collections, for automatic artistic recommendation, and even for mobile capture and identification by consumers. Our evaluation uses variable-resolution painting data gathered across Internet sources rather than solely using professional high-resolution data. Consequently, we believe this solution better addresses the task of classifying consumer-quality digital captures than other existing approaches. We include a comparison to existing feature extraction and classification methods as well as an analysis of our own approach across classifiers and feature vectors.
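
The pipeline shape (features extracted from the image, fed to a standard classifier) can be sketched with invented data. The per-channel mean/deviation features, the two "genres", and the nearest-centroid rule are all illustrative stand-ins for the richer feature vectors and classifier comparison in the paper.

```python
import numpy as np

def features(img):
    """Hypothetical feature vector: per-channel mean and standard deviation."""
    return np.concatenate([img.mean(axis=(0, 1)), img.std(axis=(0, 1))])

rng = np.random.default_rng(5)
# Made-up "genres": bright low-contrast vs. darker high-contrast images.
train = [(np.clip(0.8 + 0.05 * rng.standard_normal((16, 16, 3)), 0, 1), "pastel")
         for _ in range(20)]
train += [(np.clip(0.3 + 0.3 * rng.standard_normal((16, 16, 3)), 0, 1), "baroque")
          for _ in range(20)]

# Train a nearest-centroid classifier: one mean feature vector per genre.
centroids = {}
for genre in ("pastel", "baroque"):
    feats = [features(im) for im, g in train if g == genre]
    centroids[genre] = np.mean(feats, axis=0)

def classify(img):
    f = features(img)
    return min(centroids, key=lambda g: np.linalg.norm(f - centroids[g]))

test_img = np.clip(0.8 + 0.05 * rng.standard_normal((16, 16, 3)), 0, 1)
dark_img = np.clip(0.3 + 0.3 * rng.standard_normal((16, 16, 3)), 0, 1)
```

Features this crude already tolerate resolution changes, which hints at why the approach suits variable-resolution consumer captures.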

Collaboration


Dive into Bryan Pardo's collaborations.

Top Co-Authors

Zafar Rafii (Northwestern University)
Zhiyao Duan (University of Rochester)
Jinyu Han (Northwestern University)
David Little (Northwestern University)
Colin Meek (University of Michigan)