Brian John King | Researchain

Publication

Featured research published by Brian John King.

International Workshop on Machine Learning for Signal Processing | 2012

Optimal cost function and magnitude power for NMF-based speech separation and music interpolation

Brian John King; Cédric Févotte; Paris Smaragdis

There has been a significant amount of research in new algorithms and applications for nonnegative matrix factorization, but relatively little has been published on practical considerations for real-world applications, such as choosing optimal parameters for a particular application. In this paper, we will look at two applications, single-channel source separation of speech and interpolating missing music data. We will present the optimal parameters found for the experiments as well as discuss how parameters affect performance.

IEEE Transactions on Audio, Speech, and Language Processing | 2011

Single-Channel Source Separation Using Complex Matrix Factorization

Brian John King; Les E. Atlas

Nonnegative matrix factorization is gaining popularity in speech and audio processing applications. Performing nonnegative matrix factorization on a complex-valued short-time Fourier transform, however, makes assumptions on the signal, such as additivity in the magnitude domain, potentially degrading the results. One application where these assumptions can cause a problem is in single-channel source separation of overlapping speech. In this paper, we present how this problem can be solved by incorporating phase estimation via complex matrix factorization. Another challenge in source separation is how to select reconstruction bases for optimal separation. In this paper, we compare the most common method with a new, simpler method of finding bases that does not share many of the challenges of the current, established method. The paper will conclude by comparing nonnegative with complex matrix factorization as well as the previous and new methods for finding bases on the task of automatic speech recognition of single-channel two-talker overlapping speech.
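The additivity assumption mentioned in this abstract can be checked with a small worked example. The sketch below is illustrative only and is not taken from the paper: the helper name `mix_magnitude` and the chosen magnitudes and phases are assumptions. It adds two complex values, standing in for two sources in one time-frequency bin, and shows that the magnitude of the mixture equals the sum of the source magnitudes only when the phases agree.

```python
import cmath

def mix_magnitude(mag_a, phase_a, mag_b, phase_b):
    """Magnitude of the sum of two complex bin values given in polar form.

    Hypothetical helper for illustration; not from the paper.
    """
    a = cmath.rect(mag_a, phase_a)
    b = cmath.rect(mag_b, phase_b)
    return abs(a + b)

# Two sources, each with magnitude 1.0 in the same time-frequency bin.
in_phase = mix_magnitude(1.0, 0.0, 1.0, 0.0)             # phases agree
quadrature = mix_magnitude(1.0, 0.0, 1.0, cmath.pi / 2)  # 90 degrees apart
opposed = mix_magnitude(1.0, 0.0, 1.0, cmath.pi)         # phases cancel

# Magnitude-domain additivity would predict 2.0 in all three cases.
print(in_phase, quadrature, opposed)
```

Only the in-phase case matches the additive prediction, which is the motivation the abstract gives for estimating phase as part of the factorization.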

International Conference on Acoustics, Speech, and Signal Processing | 2010

Single-channel source separation using simplified-training complex matrix factorization

Brian John King; Les E. Atlas

Although the task seems trivial for human listeners, research in automating source separation still lags far behind human performance and is especially difficult for single-channel signals. One of the latest and most promising methods of single-channel source separation is non-negative matrix factorization, which works by synthesizing signals from a learned set of bases for each source. In this paper, we present a new method of creating these learned sets of bases used in the matrix factorization technique for single-channel source separation. This new method does not suffer the complication of choosing an optimal number of bases as in previous methods. In addition, this paper further explores the new method of complex matrix factorization and compares its performance to non-negative, real matrix factorization for automatic speech recognition of two-talker mixtures.
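As a rough illustration of how a set of bases can be learned, the sketch below factors a small nonnegative matrix `V` into bases `W` and weights `H` using the widely known multiplicative updates for the Euclidean cost. This is a generic textbook NMF, not the training method proposed in the paper; the function names, the toy matrix, and the iteration counts are all assumptions made for the example.

```python
import random

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    Bt = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def update(X, num, den, eps):
    """Elementwise multiplicative update X * num / den."""
    return [[x * n / (d + eps) for x, n, d in zip(xr, nr, dr)]
            for xr, nr, dr in zip(X, num, den)]

def nmf(V, n_bases, n_iter=200, seed=0, eps=1e-9):
    """Factor nonnegative V (F x T) into W (F x K) and H (K x T)."""
    rng = random.Random(seed)
    F, T = len(V), len(V[0])
    W = [[rng.random() + eps for _ in range(n_bases)] for _ in range(F)]
    H = [[rng.random() + eps for _ in range(T)] for _ in range(n_bases)]
    for _ in range(n_iter):
        Wt = transpose(W)
        # H <- H * (W^T V) / (W^T W H)
        H = update(H, matmul(Wt, V), matmul(matmul(Wt, W), H), eps)
        Ht = transpose(H)
        # W <- W * (V H^T) / (W H H^T)
        W = update(W, matmul(V, Ht), matmul(W, matmul(H, Ht)), eps)
    return W, H

def sq_error(V, W, H):
    """Sum of squared differences between V and the product W H."""
    R = matmul(W, H)
    return sum((v - r) ** 2 for vr, rr in zip(V, R) for v, r in zip(vr, rr))

# Toy "spectrogram" built from two known bases (illustrative data only).
W_true = [[1.0, 0.0], [2.0, 1.0], [0.0, 3.0], [1.0, 1.0]]
H_true = [[1.0, 0.0, 2.0, 1.0], [0.0, 1.0, 1.0, 2.0]]
V = matmul(W_true, H_true)

W1, H1 = nmf(V, n_bases=2, n_iter=1)
W200, H200 = nmf(V, n_bases=2, n_iter=200)
err_start = sq_error(V, W1, H1)
err_end = sq_error(V, W200, H200)
print(err_start, err_end)
```

Note that `n_bases` must be chosen by hand here, which is the kind of tuning the abstract says the new training method avoids.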

International Conference on Acoustics, Speech, and Signal Processing | 2012

Noise-robust dynamic time warping using PLCA features

Brian John King; Paris Smaragdis; Gautham J. Mysore

Conventional speech features, such as mel-frequency cepstral coefficients, tend to perform well in template matching systems, such as dynamic time warping, in low noise conditions. However, they tend to degrade in noisy environments. We propose a method of calculating features using the probabilistic latent component analysis (PLCA) framework. This framework models the speech and noise separately, leading to higher performance in noisy conditions than conventional methods. In this work, we compare our PLCA-based features with conventional features on the task of aligning a high-fidelity speech recording to a noisy speech recording, a scenario common in automatic dialogue replacement.
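For readers unfamiliar with the template matching step, here is a minimal sketch of classic dynamic time warping. It assumes scalar features and an absolute-difference local cost purely for illustration; the paper's PLCA-based features are not implemented, and the function name `dtw` and the toy sequences are hypothetical.

```python
def dtw(x, y, dist=lambda a, b: abs(a - b)):
    """Return (total cost, warping path) aligning sequence x to sequence y."""
    n, m = len(x), len(y)
    inf = float("inf")
    # D[i][j] holds the cheapest cost of aligning x[:i] with y[:j].
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
            D[i][j] = dist(x[i - 1], y[j - 1]) + step
    # Backtrack from the end to recover which frames were matched.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (D[i - 1][j - 1], i - 1, j - 1),
            (D[i - 1][j], i - 1, j),
            (D[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
    path.reverse()
    return D[n][m], path

# A reference contour and a time-stretched copy of it (toy data).
reference = [0, 1, 2, 1, 0]
stretched = [0, 0, 1, 1, 2, 2, 1, 1, 0, 0]
cost, path = dtw(reference, stretched)
print(cost, path)
```

In a real alignment task each sequence element would be a per-frame feature vector and `dist` a vector distance; the choice of features is what the abstract proposes to change for noisy recordings.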

Journal of the Acoustical Society of America | 2011

Monaural source separation in underground spaces via non‐negative and complex matrix factorization.

Brian John King; Les E. Atlas

Automated separation of multiple independent acoustic sources collected on a single monaural sound channel is an active research area and can aid in many applications, including practical automatic speech recognition, speaker identification, and keyword identification in multitalker, noisy, and/or reverberant multisource environments. Some of the latest and most promising methods of single‐channel source separation are non‐negative and the even more recent complex matrix factorization, which decompose a signal into a sparse linear combination of source‐specific building blocks, commonly referred to as bases. Once the bases and accompanying weights are calculated, separating a source from the mixture is achieved by multiplying and summing together its corresponding bases and weights. While experiments exhibit significant separation, the majority of this work has been done on studio‐recorded audio, and consequently little has been done in more realistic acoustic environments. Some of the most important environment...
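The reconstruction step described in this abstract, multiplying and summing a source's bases and weights, can be sketched in a few lines. The matrices below are made-up toy values, and the function name `reconstruct_source` and the assignment of bases to sources are assumptions for illustration, not data or code from the paper.

```python
def reconstruct_source(W, H, basis_ids):
    """Sum the terms W[f][k] * H[k][t] over the bases assigned to one source."""
    n_freq, n_frames = len(W), len(H[0])
    return [[sum(W[f][k] * H[k][t] for k in basis_ids)
             for t in range(n_frames)]
            for f in range(n_freq)]

# Toy factorization: 2 frequency bins, 3 bases, 2 time frames.
W = [[1.0, 0.0, 0.5],
     [0.0, 2.0, 0.5]]
H = [[1.0, 0.0],
     [0.0, 1.0],
     [2.0, 2.0]]

# Assume bases 0 and 1 belong to source A and basis 2 to source B.
source_a = reconstruct_source(W, H, [0, 1])
source_b = reconstruct_source(W, H, [2])
mixture = reconstruct_source(W, H, [0, 1, 2])
print(source_a, source_b, mixture)
```

Because each source uses a disjoint subset of the bases, the two reconstructions add up to the full model of the mixture.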

Archive | 2012