Brian John King
Adobe Systems
Publication
Featured research published by Brian John King.
International Workshop on Machine Learning for Signal Processing | 2012
Brian John King; Cédric Févotte; Paris Smaragdis
There has been a significant amount of research in new algorithms and applications for nonnegative matrix factorization, but relatively little has been published on practical considerations for real-world applications, such as choosing optimal parameters for a particular application. In this paper, we look at two applications: single-channel source separation of speech and interpolation of missing music data. We present the optimal parameters found in the experiments and discuss how the parameters affect performance.
IEEE Transactions on Audio, Speech, and Language Processing | 2011
Brian John King; Les E. Atlas
Nonnegative matrix factorization is gaining popularity in speech and audio processing applications. Performing nonnegative matrix factorization on a complex-valued short-time Fourier transform, however, makes assumptions on the signal, such as additivity in the magnitude domain, potentially degrading the results. One application where these assumptions can cause a problem is in single-channel source separation of overlapping speech. In this paper, we present how this problem can be solved by incorporating phase estimation via complex matrix factorization. Another challenge in source separation is how to select reconstruction bases for optimal separation. In this paper, we compare the most common method with a new, simpler method of finding bases that does not share many of the challenges of the current, established method. The paper will conclude by comparing nonnegative with complex matrix factorization as well as the previous and new methods for finding bases on the task of automatic speech recognition of single-channel two-talker overlapping speech.
International Conference on Acoustics, Speech, and Signal Processing | 2010
Brian John King; Les E. Atlas
Although the task seems trivial for human listeners, research in automating source separation still lags far behind human performance and is especially difficult for single-channel signals. One of the latest and most promising methods of single-channel source separation is non-negative matrix factorization, which works by synthesizing signals from a learned set of bases for each source. In this paper, we present a new method of creating these learned sets of bases used in the matrix factorization technique for single-channel source separation. This new method does not suffer the complication of choosing an optimal number of bases as in previous methods. In addition, this paper further explores the new method of complex matrix factorization and compares its performance to non-negative, real matrix factorization for automatic speech recognition of two-talker mixtures.
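As a rough illustration of the idea described in this abstract (synthesizing each source from its own learned set of bases), the following minimal sketch fits nonnegative weights to a mixture spectrogram while the bases stay fixed, then rebuilds one source from its bases and weights alone. It is not the authors' implementation: the function names, the toy matrices, and the choice of the Euclidean multiplicative update are assumptions made for this example, and the complex matrix factorization variant is not shown.

```python
# Sketch of supervised NMF separation with fixed, pre-learned bases.
# All names and the toy data below are hypothetical.

def matmul(A, B):
    """Plain matrix product for lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def estimate_weights(V, W, n_iter=200, eps=1e-12):
    """Hold the bases W fixed and fit nonnegative weights H so that W*H ~ V.

    Uses the multiplicative update for the Euclidean cost:
    H <- H * (W^T V) / (W^T W H).
    """
    n_bases, n_frames = len(W[0]), len(V[0])
    H = [[1.0] * n_frames for _ in range(n_bases)]
    Wt = transpose(W)
    WtV = matmul(Wt, V)
    WtW = matmul(Wt, W)
    for _ in range(n_iter):
        WtWH = matmul(WtW, H)
        H = [[h * num / (den + eps) for h, num, den in zip(h_row, n_row, d_row)]
             for h_row, n_row, d_row in zip(H, WtV, WtWH)]
    return H

def reconstruct_source(W, H, basis_idx):
    """Multiply and sum only the bases and weights that belong to one source."""
    W_s = [[row[k] for k in basis_idx] for row in W]
    H_s = [H[k] for k in basis_idx]
    return matmul(W_s, H_s)

# Toy magnitude "spectrogram": 3 frequency bins x 3 frames.
# Column 0 of W is the single basis of source A, column 1 that of source B.
W = [[1.0, 0.0],
     [0.5, 0.5],
     [0.0, 1.0]]
V = [[2.0, 1.0, 1.0],
     [1.5, 2.0, 1.0],
     [1.0, 3.0, 1.0]]

H = estimate_weights(V, W)
source_a = reconstruct_source(W, H, [0])
source_b = reconstruct_source(W, H, [1])
```

In this toy case the mixture was built from weights `[[2, 1, 1], [1, 3, 1]]`, so the estimated `H` should approach those values and `source_a` plus `source_b` should sum back to `V`. In practice the bases would first be learned from training data for each talker.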
International Conference on Acoustics, Speech, and Signal Processing | 2012
Brian John King; Paris Smaragdis; Gautham J. Mysore
Conventional speech features, such as mel-frequency cepstral coefficients, tend to perform well in template matching systems, such as dynamic time warping, in low noise conditions. However, they tend to degrade in noisy environments. We propose a method of calculating features using the probabilistic latent component analysis (PLCA) framework. This framework models the speech and noise separately, leading to higher performance in noisy conditions than conventional methods. In this work, we compare our PLCA-based features with conventional features on the task of aligning a high-fidelity speech recording to a noisy speech recording, a scenario common in automatic dialogue replacement.
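The alignment task mentioned in this abstract relies on dynamic time warping. A minimal sketch of generic dynamic time warping follows, assuming scalar features and an absolute-difference distance purely for illustration. The PLCA-based features proposed in the paper are not reproduced here, and the function name `dtw` and the toy sequences are hypothetical.

```python
# Sketch of dynamic time warping between two feature sequences.
# Scalar features and absolute-difference distance are assumptions.

def dtw(a, b, dist=lambda x, y: abs(x - y)):
    """Return (total alignment cost, warping path) for sequences a and b.

    The path is a list of (index_in_a, index_in_b) pairs from start to end.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1],  # advance both
                                 cost[i - 1][j],      # advance a only
                                 cost[i][j - 1])      # advance b only
    # Backtrack from the end to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return cost[n][m], path

# Toy example: the second sequence repeats one frame of the first.
clean = [0, 1, 2, 3]
stretched = [0, 1, 1, 2, 3]
total_cost, path = dtw(clean, stretched)
```

Here frame 1 of `clean` is matched to both frames 1 and 2 of `stretched`, which is the kind of time stretching an alignment between two recordings of the same speech has to absorb. With real audio, each element would be a feature vector per frame and `dist` a vector distance.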
Journal of the Acoustical Society of America | 2011
Brian John King; Les E. Atlas
Automated separation of multiple independent acoustic sources collected on a single monaural sound channel is an active research area and can aid in many applications, including practical automatic speech recognition, speaker identification, and keyword identification in multitalker, noisy, and/or reverberant multisource environments. Some of the latest and most promising methods of single‐channel source separation are non‐negative and the even more recent complex matrix factorization, which decompose a signal into a sparse linear combination of source‐specific building blocks, commonly referred to as bases. Once the bases and accompanying weights are calculated, separating a source from the mixture is achieved by multiplying and summing together its corresponding bases and weights. While experiments exhibit significant separation, the majority of this work has been done on studio‐recorded audio, and consequently little has been done in more realistic acoustic environments. Some of the most important environment...
Archive | 2012
Brian John King; Gautham J. Mysore; Paris Smaragdis