Publication


Featured research published by Scott Wisdom.


Proceedings of the IEEE | 2013

Enabling Seamless Wireless Power Delivery in Dynamic Environments

Alanson P. Sample; Benjamin H. Waters; Scott Wisdom; Joshua R. Smith

Effective means of delivering wireless power to volumes of spaces will enable users the freedom and mobility to seamlessly power and recharge their devices in an unencumbered fashion. This has particular importance for consumer electronic, medical, and industrial applications, where usage models focus on unstructured and dynamic environments. However, existing wireless power technology falls short of this vision. Inductive charging solutions are limited to near-contact distances and require a docking station or precise placement for effective operation. Far-field wireless power techniques allow much greater range, but require complicated tracking systems to maintain a line-of-sight connection for high-efficiency power delivery to mobile applications. Recent work using magnetically coupled resonators (MCRs) for wireless power delivery has shown a promising intersection between range (on the order of a meter), efficiency (over 80%), and delivered power (up to tens of watts). However, unpredictable loads rapidly change system operating points, and changes in position disrupt system efficiency, which affects the ultimate usability of these systems. Dynamic adaptation to these changes in operating conditions and power transfer range is a critical capability in developing a fully functional and versatile wireless power solution. This paper provides an overview of methods used to adapt to variations in range, orientation, and load using both wideband and fixed-frequency techniques.


International Conference on Acoustics, Speech, and Signal Processing | 2014

Extending coherence time for analysis of modulated random processes

Scott Wisdom; Les E. Atlas; James W. Pitton

In this paper, we relax a commonly used assumption about a class of nonstationary random processes composed of modulated wide-sense stationary random processes: that the fundamental frequency of the modulator is stationary within the analysis window. To compensate for the relaxation of this assumption, we define the generalized DEMON (“demodulated noise”) spectrum representing modulation frequency, which we use to increase the coherence time of such signals. Increased coherence time means longer analysis windows, which provide higher-SNR estimators. We use the example of detection on both synthetic and real-world passive sonar signals to demonstrate this increase.
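For orientation, the conventional DEMON idea that this abstract generalizes can be sketched as follows: square the signal to expose its envelope, then look for a peak in the spectrum of that envelope. This is only an illustrative sketch of the stationary-modulator case, not the generalized DEMON spectrum defined in the paper; the function name `demon_spectrum`, the synthetic signal, and all parameter values are assumptions made for the example.

```python
import cmath
import math
import random

def demon_spectrum(x, max_bin):
    """Magnitude DFT of the squared signal (square-law envelope), bins 1..max_bin."""
    n = len(x)
    envelope = [v * v for v in x]
    mean = sum(envelope) / n
    envelope = [e - mean for e in envelope]  # remove the DC term
    spectrum = []
    for k in range(1, max_bin + 1):
        acc = sum(e * cmath.exp(-2j * math.pi * k * t / n)
                  for t, e in enumerate(envelope))
        spectrum.append(abs(acc))
    return spectrum  # spectrum[k - 1] corresponds to modulation bin k

# Hypothetical test signal: white Gaussian noise whose amplitude is modulated
# at a fixed rate of 12 cycles per analysis window (a stationary modulator).
rng = random.Random(1)
n, mod_bin = 1024, 12
x = [(1.0 + 0.8 * math.cos(2 * math.pi * mod_bin * t / n)) * rng.gauss(0, 1)
     for t in range(n)]

spec = demon_spectrum(x, max_bin=64)
peak_bin = 1 + max(range(len(spec)), key=spec.__getitem__)
```

When the modulator frequency drifts within the window, the energy of this peak smears across neighboring bins, which is the limitation the paper's relaxation is meant to address.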


International Conference on Acoustics, Speech, and Signal Processing | 2016

Deep unfolding for multichannel source separation

Scott Wisdom; John R. Hershey; Jonathan Le Roux; Shinji Watanabe

Deep unfolding has recently been proposed to derive novel deep network architectures from model-based approaches. In this paper, we consider its application to multichannel source separation. We unfold a multichannel Gaussian mixture model (MCGMM), resulting in a deep MCGMM computational network that directly processes complex-valued frequency-domain multichannel audio and has an architecture defined explicitly by a generative model, thus combining the advantages of deep networks and model-based approaches. We further extend the deep MCGMM by modeling the GMM states using an MRF, whose unfolded mean-field inference updates add dynamics across layers. Experiments on source separation for multichannel mixtures of two simultaneous speakers show that the deep MCGMM leads to improved performance with respect to the original MCGMM.


International Conference on Acoustics, Speech, and Signal Processing | 2015

Voice activity detection using subband noncircularity

Scott Wisdom; Greg Okopal; Les E. Atlas; James W. Pitton

Many voice activity detection (VAD) systems use the magnitude of complex-valued spectral representations. However, using only the magnitude often does not fully characterize the statistical behavior of the complex values. We present two novel methods for performing VAD on single- and dual-channel audio that do completely account for the second-order statistical behavior of complex data. Our methods exploit the second-order noncircularity (also known as impropriety) of complex subbands of speech and noise. Since speech tends to be more improper than noise, higher impropriety suggests speech activity. Our single-channel method is blind in the sense that it is unsupervised and, unlike many VAD systems, does not rely on non-speech periods for noise parameter estimation. Our methods achieve improved performance over other state-of-the-art magnitude-based VADs on the QUT-NOISE-TIMIT corpus, which indicates that impropriety is a compelling new feature for voice activity detection.
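The notion of second-order noncircularity used here can be made concrete with the standard circularity coefficient of a zero-mean complex variable, |E[z^2]| / E[|z|^2]. The sketch below estimates it from samples; it illustrates the statistic only, not the detectors proposed in the paper, and the function name and synthetic data are assumptions made for the example.

```python
import cmath
import random

def circularity_coefficient(samples):
    """Sample estimate of |E[z^2]| / E[|z|^2] for zero-mean complex data.

    Values near 0 indicate second-order circular (proper) data; values near 1
    indicate strongly noncircular (improper) data.
    """
    n = len(samples)
    variance = sum(abs(z) ** 2 for z in samples) / n    # E[|z|^2]
    pseudo_variance = sum(z * z for z in samples) / n   # E[z^2], no conjugate
    return abs(pseudo_variance) / variance

rng = random.Random(0)
# Proper noise: independent real and imaginary parts with equal variance.
proper = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(20000)]
# Improper data: a real-valued sequence rotated by a fixed phase, so every
# sample lies on one line through the origin.
improper = [rng.gauss(0, 1) * cmath.exp(0.7j) for _ in range(20000)]

rho_proper = circularity_coefficient(proper)      # close to 0
rho_improper = circularity_coefficient(improper)  # close to 1
```

A detector in this spirit would compare such a statistic, computed per subband and per frame, against a threshold, with higher values suggesting speech activity.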


Asilomar Conference on Signals, Systems and Computers | 2014

Estimating the noncircularity of latent components within complex-valued subband mixtures with applications to speech processing

Greg Okopal; Scott Wisdom; Les E. Atlas

This paper describes an approach that estimates the circularity coefficients of multiple underlying components within complex subbands of an additive mixture of voiced speech and noise via the strong uncorrelating transform (SUT). For the SUT to be effective, the latent source signals must have unique nonzero circularity coefficients; this requirement is satisfied by using narrow filters to impose a degree of noncircularity upon what would typically be circular noise. The circularity coefficient estimates are then used for voice activity detection, pitch tracking, and enhancement.


IEEE Transactions on Audio, Speech, and Language Processing | 2015

Speech analysis with the strong uncorrelating transform

Greg Okopal; Scott Wisdom; Les E. Atlas

The strong uncorrelating transform (SUT) provides estimates of independent components from linear mixtures using only second-order information, provided that the components have unique circularity coefficients. We propose a processing framework for generating complex-valued subbands from real-valued mixtures of speech and noise where the objective is to control the likely values of the sample circularity coefficients of the underlying speech and noise components in each subband. We show how several processing parameters affect the noncircularity of speech-like and noise components in the subband, ultimately informing parameter choices that allow for estimation of each of the components in a subband using the SUT. Additionally, because the speech and noise components will have unique sample circularity coefficients, this statistic can be used to identify time-frequency regions that contain voiced speech. We give an example of the recovery of the circularity coefficients of a real speech signal from a two-channel noisy mixture at -25 dB SNR, which demonstrates how the estimates of noncircularity can reveal the time-frequency structure of a speech signal in very high levels of noise. Finally, we present the results of a voice activity detection (VAD) experiment showing that two new circularity-based statistics, one of which is derived from the SUT processing, can achieve improved performance over state-of-the-art VADs in real-world recordings of noise.


International Conference on Acoustics, Speech, and Signal Processing | 2017

Building recurrent networks by unfolding iterative thresholding for sequential sparse recovery

Scott Wisdom; Thomas Powers; James W. Pitton; Les E. Atlas

Historically, sparse methods and neural networks, particularly modern deep learning methods, have been relatively disparate areas. Sparse methods are typically used for signal enhancement, compression, and recovery, usually in an unsupervised framework, while neural networks commonly rely on a supervised training set. In this paper, we use the specific problem of sequential sparse recovery, which models a sequence of observations over time using a sequence of sparse coefficients, to show how algorithms for sparse modeling can be combined with supervised deep learning to improve sparse recovery. Specifically, we show that the iterative soft-thresholding algorithm (ISTA) for sequential sparse recovery corresponds to a stacked recurrent neural network (RNN) under specific architecture and parameter constraints. Then we demonstrate the benefit of training this RNN with backpropagation using supervised data for the task of column-wise compressive sensing of images. This training corresponds to adaptation of the original iterative thresholding algorithm and its parameters. Thus, we show by example that sparse modeling can provide a rich source of principled and structured deep network architectures that can be trained to improve performance on specific tasks.
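As background for the unfolding idea, the following is a minimal sketch of plain ISTA for a single observation, minimizing 0.5*||A x - y||^2 + lam*||x||_1. It is not the sequential variant or the trained network from the paper; the function names, the step-size choice, and the example matrix are assumptions made for illustration. Each pass of the loop applies a linear map followed by a soft-threshold nonlinearity, which is why a fixed number of iterations can be read as the layers of a network.

```python
def soft_threshold(v, t):
    """Elementwise soft-thresholding: shrink each entry toward zero by t."""
    return [max(abs(x) - t, 0.0) * (1.0 if x > 0 else -1.0) for x in v]

def ista(A, y, lam, num_iters):
    """Approximately minimize 0.5*||A x - y||^2 + lam*||x||_1 with ISTA."""
    m, n = len(A), len(A[0])
    # The squared Frobenius norm upper-bounds the largest eigenvalue of A^T A,
    # so 1/L is a safe, if conservative, step size.
    L = sum(a * a for row in A for a in row)
    x = [0.0] * n
    for _ in range(num_iters):  # each pass plays the role of one layer
        residual = [sum(A[i][j] * x[j] for j in range(n)) - y[i]
                    for i in range(m)]
        grad = [sum(A[i][j] * residual[i] for i in range(m))
                for j in range(n)]
        x = soft_threshold([x[j] - grad[j] / L for j in range(n)], lam / L)
    return x

# Hypothetical example: recover a sparse 3-vector from 2 measurements.
A = [[1.0, 0.5, 0.0],
     [0.0, 0.5, 1.0]]
y = [2.0, 0.0]
x_hat = ista(A, y, lam=0.1, num_iters=200)
```

In an unfolded network, quantities that are fixed here (the matrix used in the gradient step, the step size, and the threshold) would become trainable parameters of each layer.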


Asilomar Conference on Signals, Systems and Computers | 2014

Extending coherence for optimal detection of nonstationary harmonic signals

Scott Wisdom; James W. Pitton; Les E. Atlas

This paper describes an improved detector for nonstationary harmonic signals. The performance improvement is accomplished by using a novel method for extending the coherence time of such signals. This method applies a transformation to a noisy signal that attempts to fit a simple model to the signal's slowly changing fundamental frequency over the analysis duration. By matching the change in the signal's fundamental frequency, analysis is more coherent with the signal over longer durations, which allows the use of longer windows and thus improves detection performance.


Sensor Array and Multichannel Signal Processing Workshop | 2016

On spectral noncircularity of natural signals

Scott Wisdom; Les E. Atlas; James W. Pitton

Natural signals are typically nonstationary. The complex-valued frequency spectra of nonstationary signals do not have zero spectral correlation, as is assumed for wide-sense stationary processes. Instead, these spectra have non-zero second-order noncircular statistics (that is, they are not rotationally invariant) that are potentially useful for detection, classification, and enhancement. These noncircular statistics are especially significant for transient events, which are common in many natural signals. In this paper we provide practical and effective estimators for spectral noncircularity and spectral correlation. We illustrate the behavior of our spectral noncircularity estimators for synthetic signals. Then, we derive a generalized likelihood ratio test using both circular and noncircular models and show how estimates of spectral noncircularity provide performance improvements for detection of natural acoustic events.


New Era for Robust Speech Recognition, Exploiting Deep Learning | 2017

Novel Deep Architectures in Speech Processing

John R. Hershey; Jonathan Le Roux; Shinji Watanabe; Scott Wisdom; Zhuo Chen; Yusuf Isik

Model-based methods and deep neural networks have both been tremendously successful paradigms in machine learning. In model-based methods, problem domain knowledge can be built into the constraints of the model. In addition, unsupervised inference tasks such as adaptation and clustering are handled in a natural way. However, these benefits typically come at the expense of difficulties during inference. In contrast, deterministic deep neural networks are constructed in such a way that inference is straightforward, and discriminative training is relatively easy. However, their typically generic architectures often make it unclear how to incorporate specific problem knowledge or to perform flexible tasks such as unsupervised inference. This chapter introduces frameworks to provide the advantages of both approaches. To do so, we start with a model-based approach and an associated inference algorithm, and reinterpret inference iterations as layers in a deep network, while generalizing the parametrization to create a more powerful network. We show how such frameworks yield new understanding of conventional networks, and how they can result in novel networks for speech processing, including networks based on nonnegative matrix factorization, complex Gaussian microphone array signal processing, and a network inspired by efficient spectral clustering. We then discuss what has been learned in recent work and provide a prospectus for future research in this area.

Collaboration

Top Co-Authors

Les E. Atlas

University of Washington

Thomas Powers

University of Washington

Greg Okopal

University of Washington

John R. Hershey

Mitsubishi Electric Research Laboratories

Jonathan Le Roux

Mitsubishi Electric Research Laboratories

Shinji Watanabe

Mitsubishi Electric Research Laboratories