Publication


Featured research published by Demba Ba.


IEEE Transactions on Audio, Speech, and Language Processing | 2010

Using Reverberation to Improve Range and Elevation Discrimination for Small Array Sound Source Localization

Flavio Protasio Ribeiro; Cha Zhang; Dinei A. F. Florêncio; Demba Ba

Sound source localization (SSL) is an essential task in many applications involving speech capture and enhancement. As such, speaker localization with microphone arrays has received significant research attention. Nevertheless, existing SSL algorithms for small arrays still have two significant limitations: lack of range resolution, and accuracy degradation with increasing reverberation. The latter is natural and expected, given that strong reflections can have amplitudes similar to that of the direct signal, but different directions of arrival. Therefore, correctly modeling the room and compensating for the reflections should reduce the degradation due to reverberation. In this paper, we show a stronger result. If modeled correctly, early reflections can be used to provide more information about the source location than would have been available in an anechoic scenario. The modeling not only compensates for the reverberation, but also significantly increases resolution for range and elevation. Thus, we show that under certain conditions and limitations, reverberation can be used to improve SSL performance. Prior attempts to compensate for reverberation tried to model the room impulse response (RIR). However, RIRs change quickly with speaker position, and are nearly impossible to track accurately. Instead, we build a 3-D model of the room, which we use to predict early reflections, which are then incorporated into the SSL estimation. Simulation results with real and synthetic data show that even a simplistic room model is sufficient to produce significant improvements in range and elevation estimation, tasks which would be very difficult when relying only on direct path signal components.
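As a rough illustration of how a room model can predict an early reflection, the sketch below mirrors a source across a single wall (the image-source idea) and compares the direct and reflected path delays. The room dimensions, the positions, and the helper name `image_source` are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def image_source(src, wall_point, wall_normal):
    """Mirror a source position across the plane through wall_point with the given normal."""
    n = wall_normal / np.linalg.norm(wall_normal)
    return src - 2.0 * np.dot(src - wall_point, n) * n

# Hypothetical geometry: a talker, a microphone, and a ceiling at z = 3 m.
src = np.array([2.0, 1.0, 1.5])
mic = np.array([0.0, 0.0, 1.0])
ceiling_point = np.array([0.0, 0.0, 3.0])
ceiling_normal = np.array([0.0, 0.0, -1.0])

img = image_source(src, ceiling_point, ceiling_normal)

# The first-order ceiling reflection travels the same distance as a
# straight path from the image source to the microphone.
direct_delay = np.linalg.norm(src - mic) / SPEED_OF_SOUND
reflected_delay = np.linalg.norm(img - mic) / SPEED_OF_SOUND
```

The reflected delay depends on the source height and range in a different way than the direct delay does, which is one intuition for why reflections can carry extra range and elevation information.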


International Conference on Acoustics, Speech, and Signal Processing | 2010

L1 regularized room modeling with compact microphone arrays

Demba Ba; Flavio Protasio Ribeiro; Cha Zhang; Dinei A. F. Florêncio

Acoustic room modeling has several applications. Recent results using large microphone arrays show good performance, and are helpful in many applications. For example, when designing a better acoustic treatment for a concert hall, these large arrays can be used to help map the acoustic environment and aid in the design. However, in real-time applications - including de-reverberation, sound source localization, speech enhancement and 3D audio - it is desirable to model the room with existing small arrays and existing loudspeakers. In this paper we propose a novel room modeling algorithm, which uses a constrained room model and ℓ1-regularized least-squares to achieve good estimation of room geometry. We present experimental results on both real and synthetic data.


IEEE Transactions on Audio, Speech, and Language Processing | 2012

Geometrically Constrained Room Modeling With Compact Microphone Arrays

Flavio Protasio Ribeiro; Dinei A. F. Florêncio; Demba Ba; Cha Zhang

The geometry of an acoustic environment can be important information in many audio signal processing applications. To estimate such a geometry, previous work has relied on large microphone arrays, multiple test sources, moving sources or the assumption of a 2-D room. In this paper, we lift these requirements and present a novel method that uses a compact microphone array to estimate 3-D room geometry, delivering effective estimates with low-cost hardware. Our approach first probes the environment with a known test signal emitted by a loudspeaker co-located with the array, from which the room impulse responses (RIRs) are estimated. It then uses an ℓ1-regularized least-squares minimization to fit synthetically generated reflections to the RIRs, producing a sparse set of reflections. By enforcing structural constraints derived from the image model, these are classified into first-, second-, and third-order reflections, thereby deriving the room geometry. Using this method, we detect walls using off-the-shelf teleconferencing hardware with a typical range resolution of about 1 cm. We present results using simulations and data from real environments.
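The ℓ1-regularized least-squares step can be illustrated with a generic solver. The sketch below uses iterative soft thresholding (ISTA), which is one standard way to solve such problems and is not necessarily the solver used in the paper. The random matrix `A` is a hypothetical stand-in for a dictionary of synthetically generated reflections, and `lam` is an arbitrary regularization weight.

```python
import numpy as np

def ista_lasso(A, y, lam, n_iter=500):
    """Minimize 0.5*||A x - y||^2 + lam*||x||_1 by iterative soft thresholding."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = x - step * (A.T @ (A @ x - y))                        # gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold
    return x

# Hypothetical data: 3 active "reflections" out of 50 candidates.
rng = np.random.default_rng(0)
A = rng.standard_normal((200, 50))
x_true = np.zeros(50)
x_true[[5, 20, 41]] = [1.0, -0.7, 0.5]
y = A @ x_true + 0.01 * rng.standard_normal(200)

x_hat = ista_lasso(A, y, lam=1.0)
support = np.flatnonzero(np.abs(x_hat) > 0.1)  # indices of the recovered sparse set
```

The ℓ1 penalty drives most coefficients to exactly zero, so only a few candidate reflections remain, which is the sparsity the abstract relies on.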


International Conference on Multimedia and Expo | 2007

Enhanced MVDR Beamforming for Arrays of Directional Microphones

Demba Ba; Dinei A. F. Florêncio; Cha Zhang

Microphone arrays based on the minimum variance distortionless response (MVDR) beamformer are among the most popular for speech enhancement applications. The original MVDR is excessively sensitive to source location and microphone gains. Previous research has made MVDR practical by successfully increasing the robustness of MVDR to source location, and MVDR-based microphone arrays are already commercially available. Nevertheless, MVDR performance is still weak in cases where microphone gain variations are too large, e.g., for circular arrays of directional microphones. In this paper we propose an improved MVDR beamformer which takes into account the effect of sensors (e.g. microphones) with arbitrary, potentially directional responses. Specifically, we form estimates of the relative magnitude responses of the sensors based on the data received at the array and include those in the original formulation of the MVDR beamforming problem. Experimental results on real-world audio data show an average 2.4 dB improvement over conventional MVDR beamforming, which does not account for the magnitude responses of the sensors.
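For reference, the conventional MVDR weights are w = R⁻¹d / (dᴴR⁻¹d), where R is the noise covariance and d is the steering vector. The sketch below computes them and, as one plausible reading of the abstract, folds per-microphone magnitude gains into d. The array geometry, the `gains` values, and the simulated noise are assumptions for illustration only; the paper's gain estimation procedure is not reproduced here.

```python
import numpy as np

def mvdr_weights(R, d):
    """Return the MVDR weights w = R^{-1} d / (d^H R^{-1} d)."""
    Rinv_d = np.linalg.solve(R, d)
    return Rinv_d / (d.conj() @ Rinv_d)

rng = np.random.default_rng(0)
M = 4  # number of microphones (hypothetical)

# Hypothetical half-wavelength linear array looking at 0.3 rad.
phase = np.exp(-1j * np.pi * np.arange(M) * np.sin(0.3))
# Hypothetical relative magnitude responses of directional microphones.
gains = np.array([1.0, 0.8, 0.6, 0.9])
d = gains * phase  # steering vector that accounts for sensor gains

# Sample noise covariance from simulated snapshots, with light diagonal loading.
X = rng.standard_normal((M, 1000)) + 1j * rng.standard_normal((M, 1000))
R = X @ X.conj().T / 1000 + 1e-3 * np.eye(M)

w = mvdr_weights(R, d)
response = w.conj() @ d  # distortionless constraint: w^H d should equal 1
```

If the gains were ignored (all ones) while the real sensors were directional, the constraint would be enforced for the wrong steering vector, which is the mismatch the abstract describes.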


International Conference on Multimedia and Expo | 2010

Turning enemies into friends: Using reflections to improve sound source localization

Flavio Protasio Ribeiro; Demba Ba; Cha Zhang; Dinei A. F. Florêncio

Sound Source Localization (SSL) based on microphone arrays has numerous applications, and has received significant research attention. Common to all published research is the observation that the accuracy of SSL degrades with reverberation. Indeed, early (strong) reflections can have amplitudes similar to the direct signal, and will often interfere with the estimation. In this paper, we show that reverberation is not the enemy, and can be used to improve estimation. More specifically, we are able to use early reflections to significantly improve range and elevation estimation. The process requires two steps: during setup, a loudspeaker integrated with the array emits a probing sound, which is used to obtain estimates of the ceiling height, as well as the locations of the walls. In a second step (e.g., during a meeting), the device incorporates this knowledge into a maximum likelihood SSL algorithm. Experimental results on both real and synthetic data show huge improvements in range estimation accuracy.


Proceedings of the National Academy of Sciences of the United States of America | 2014

Robust spectrotemporal decomposition by iteratively reweighted least squares

Demba Ba; Behtash Babadi; Patrick L. Purdon; Emery N. Brown

Significance: Classical spectral estimation techniques use sliding windows to enforce temporal smoothness of the spectral estimates of signals with time-varying spectrotemporal representations. This widely applied approach is not well-suited to signals that have low-dimensional, highly structured time–frequency representations. We develop a new Bayesian spectral decomposition framework, spectrotemporal pursuit, to compute spectral estimates that are smooth in time and sparse in frequency. We use a statistical interpretation of sparse recovery to derive efficient algorithms for computing spectrotemporal pursuit spectral estimates. We apply spectrotemporal pursuit to achieve a more precise delineation of the oscillatory structure of human electroencephalogram and neural spiking data under propofol general anesthesia. Spectrotemporal pursuit offers a principled alternative to existing methods for decomposing a signal into a small number of oscillatory components.

Classical nonparametric spectral analysis uses sliding windows to capture the dynamic nature of most real-world time series. This universally accepted approach fails to exploit the temporal continuity in the data and is not well-suited for signals with highly structured time–frequency representations. For a time series whose time-varying mean is the superposition of a small number of oscillatory components, we formulate nonparametric batch spectral analysis as a Bayesian estimation problem. We introduce prior distributions on the time–frequency plane that yield maximum a posteriori (MAP) spectral estimates that are continuous in time yet sparse in frequency. Our spectral decomposition procedure, termed spectrotemporal pursuit, can be efficiently computed using an iteratively reweighted least-squares algorithm and scales well with typical data lengths. We show that spectrotemporal pursuit works by applying to the time series a set of data-derived filters. Using a link between Gaussian mixture models, ℓ1 minimization, and the expectation–maximization algorithm, we prove that spectrotemporal pursuit converges to the global MAP estimate. We illustrate our technique on simulated and real human EEG data as well as on human neural spiking activity recorded during loss of consciousness induced by the anesthetic propofol. For the EEG data, our technique yields significantly denoised spectral estimates that have significantly higher time and frequency resolution than multitaper spectral estimates. For the neural spiking data, we obtain a new spectral representation of neuronal firing rates. Spectrotemporal pursuit offers a robust spectral decomposition framework that is a principled alternative to existing methods for decomposing time series into a small number of smooth oscillatory components.
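To give a feel for iteratively reweighted least squares (IRLS) as a route to frequency-sparse estimates, the sketch below applies a generic IRLS scheme to an ℓ1-penalized fit over a cosine/sine dictionary. This is a simplified, static illustration and not spectrotemporal pursuit itself: it has no temporal smoothness prior, and the signal, dictionary size, `lam`, and `eps` are all assumptions chosen for the example.

```python
import numpy as np

def irls_l1(A, y, lam, n_iter=50, eps=1e-8):
    """Approximately minimize 0.5*||A x - y||^2 + lam*||x||_1 by IRLS.

    Each pass replaces |x_i| with the quadratic bound x_i^2 / (2|x_i^old|),
    which turns the problem into a weighted ridge regression.
    """
    x = np.linalg.lstsq(A, y, rcond=None)[0]  # unpenalized starting point
    for _ in range(n_iter):
        w = 1.0 / np.sqrt(x ** 2 + eps)       # large weight where x is near zero
        x = np.linalg.solve(A.T @ A + lam * np.diag(w), A.T @ y)
    return x

# Hypothetical signal: one oscillation at 10 cycles per window, plus noise.
rng = np.random.default_rng(0)
N, K = 128, 32
t = np.arange(N)
freqs = np.arange(1, K + 1)
angles = 2 * np.pi * np.outer(t, freqs) / N
A = np.hstack([np.cos(angles), np.sin(angles)])  # columns: K cosines, then K sines
y = np.cos(2 * np.pi * 10 * t / N) + 0.05 * rng.standard_normal(N)

x_hat = irls_l1(A, y, lam=2.0)
amplitude = np.hypot(x_hat[:K], x_hat[K:])  # amplitude per frequency
peak_freq = freqs[np.argmax(amplitude)]
n_active = int(np.sum(amplitude > 0.1))
```

Each iteration is an ordinary linear solve, which is why IRLS is attractive when the same structure has to be applied to long recordings.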


Proceedings of the National Academy of Sciences of the United States of America | 2015

Measuring the signal-to-noise ratio of a neuron

Gabriela Czanner; Sridevi V. Sarma; Demba Ba; Uri T. Eden; Wei Wu; Emad N. Eskandar; Hubert H. Lim; Simona Temereanca; Wendy A. Suzuki; Emery N. Brown

Significance: Neurons represent both signal and noise in binary electrical discharges termed action potentials. Hence, the standard signal-to-noise ratio (SNR) definition of signal amplitude squared and divided by the noise variance does not apply. We show that the SNR estimates a ratio of expected prediction errors. Using point process generalized linear models, we extend the standard definition to one appropriate for single neurons. In analyses of four neural systems, we show that single neuron SNRs range from −29 dB to −3 dB and that spiking history is often a more informative predictor of spiking propensity than the signal or stimulus activating the neuron. By generalizing the standard SNR metric, we make explicit the well-known fact that individual neurons are highly noisy information transmitters.

The signal-to-noise ratio (SNR), a commonly used measure of fidelity in physical systems, is defined as the ratio of the squared amplitude or variance of a signal relative to the variance of the noise. This definition is not appropriate for neural systems in which spiking activity is more accurately represented as point processes. We show that the SNR estimates a ratio of expected prediction errors and extend the standard definition to one appropriate for single neurons by representing neural spiking activity using point process generalized linear models (PP-GLM). We estimate the prediction errors using the residual deviances from the PP-GLM fits. Because the deviance is an approximate χ² random variable, we compute a bias-corrected SNR estimate appropriate for single-neuron analysis and use the bootstrap to assess its uncertainty. In the analyses of four systems neuroscience experiments, we show that the SNRs are −10 dB to −3 dB for guinea pig auditory cortex neurons, −18 dB to −7 dB for rat thalamic neurons, −28 dB to −14 dB for monkey hippocampal neurons, and −29 dB to −20 dB for human subthalamic neurons. The new SNR definition makes explicit in the measure commonly used for physical systems the often-quoted observation that single neurons have low SNRs. The neuron’s spiking history is frequently a more informative covariate for predicting spiking propensity than the applied stimulus. Our new SNR definition extends to any GLM system in which the factors modulating the response can be expressed as separate components of a likelihood function.
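To put the reported decibel values in perspective, the short sketch below converts between decibels and linear power ratios using the standard relation SNR_dB = 10·log10(ratio). It only illustrates the unit conversion; the paper's deviance-based, bias-corrected estimator is not reproduced here, and the function names are chosen for this example.

```python
import math

def db_to_ratio(snr_db):
    """Convert an SNR in decibels to a linear power ratio."""
    return 10.0 ** (snr_db / 10.0)

def ratio_to_db(ratio):
    """Convert a linear power ratio to decibels."""
    return 10.0 * math.log10(ratio)

best_case = db_to_ratio(-3.0)    # signal power is roughly half the noise power
worst_case = db_to_ratio(-29.0)  # signal power is roughly a thousandth of the noise power
```

Even at the upper end of the reported range, then, the noise carries about twice the power of the signal.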


Neural Computation | 2014

Likelihood methods for point processes with refractoriness

Luca Citi; Demba Ba; Emery N. Brown; Riccardo Barbieri

Likelihood-based encoding models founded on point processes have received significant attention in the literature because of their ability to reveal the information encoded by spiking neural populations. We propose an approximation to the likelihood of a point-process model of neurons that holds under assumptions about the continuous time process that are physiologically reasonable for neural spike trains: the presence of a refractory period, the predictability of the conditional intensity function, and its integrability. These are properties that apply to a large class of point processes arising in applications other than neuroscience. The proposed approach has several advantages over conventional ones. In particular, one can use standard fitting procedures for generalized linear models based on iteratively reweighted least squares while improving the accuracy of the approximation to the likelihood and reducing bias in the estimation of the parameters of the underlying continuous-time model. As a result, the proposed approach can use a larger bin size to achieve the same accuracy as conventional approaches would with a smaller bin size. This is particularly important when analyzing neural data with high mean and instantaneous firing rates. We demonstrate these claims on simulated and real neural spiking activity. By allowing a substantive increase in the required bin size, our algorithm has the potential to lower the barrier to the use of point-process methods in an increasing number of applications.
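The "standard fitting procedures for generalized linear models based on iteratively reweighted least squares" mentioned above can be sketched for the conventional case: a Poisson GLM with a log link fitted to binned spike counts. This shows only the conventional baseline, not the refractoriness-aware likelihood approximation the paper proposes. The simulated stimulus, the coefficients in `beta_true`, and the function name are assumptions for illustration.

```python
import numpy as np

def fit_poisson_glm(X, counts, n_iter=25):
    """Fit a log-link Poisson GLM by iteratively reweighted least squares."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta                 # linear predictor
        mu = np.exp(eta)               # expected count per bin
        z = eta + (counts - mu) / mu   # working response
        W = mu                         # working weights for the log link
        beta = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * z))
    return beta

# Hypothetical experiment: spike counts in 20000 bins driven by one stimulus covariate.
rng = np.random.default_rng(0)
n_bins = 20000
stimulus = rng.standard_normal(n_bins)
X = np.column_stack([np.ones(n_bins), stimulus])  # intercept + stimulus
beta_true = np.array([-2.0, 0.5])
counts = rng.poisson(np.exp(X @ beta_true))

beta_hat = fit_poisson_glm(X, counts)
```

In this conventional discretization the bin width has to be small relative to the firing rate for the approximation to hold, which is the constraint the abstract says the proposed method relaxes.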


Cell | 2018

Corticoamygdala Transfer of Socially Derived Information Gates Observational Learning

Stephen A. Allsop; Romy Wichmann; Fergil Mills; Anthony Noel Burgos-Robles; Chia-Jung Chang; Ada C. Felix-Ortiz; Alienor Vienne; Anna Beyeler; Ehsan M. Izadmehr; Gordon Glober; Meghan I. Cum; Johanna Stergiadou; Kavitha K. Anandalingam; Kathryn M. Farris; Praneeth Namburi; Christopher A. Leppla; Javier C. Weddington; Edward H. Nieh; Anne C. Smith; Demba Ba; Emery N. Brown; Kay M. Tye

Observational learning is a powerful survival tool allowing individuals to learn about threat-predictive stimuli without directly experiencing the pairing of the predictive cue and punishment. This ability has been linked to the anterior cingulate cortex (ACC) and the basolateral amygdala (BLA). To investigate how information is encoded and transmitted through this circuit, we performed electrophysiological recordings in mice observing a demonstrator mouse undergo associative fear conditioning and found that BLA-projecting ACC (ACC→BLA) neurons preferentially encode socially derived aversive cue information. Inhibition of ACC→BLA alters real-time amygdala representation of the aversive cue during observational conditioning. Selective inhibition of the ACC→BLA projection impaired acquisition, but not expression, of observational fear conditioning. We show that information derived from observation about the aversive value of the cue is transmitted from the ACC to the BLA and that this routing of information is critically instructive for observational fear conditioning.


International Conference of the IEEE Engineering in Medicine and Biology Society | 2009

A regularized point process generalized linear model for assessing the functional connectivity in the cat motor cortex

Zhe Chen; David Putrino; Demba Ba; Soumya Ghosh; Riccardo Barbieri; Emery N. Brown

Identification of multiple simultaneously recorded neural spike trains is an important task in understanding neuronal dependency, functional connectivity, and temporal causality in neural systems. An assessment of the functional connectivity in a group of ensemble cells was performed using a regularized point process generalized linear model (GLM) that incorporates temporal smoothness or contiguity of the solution. An efficient convex optimization algorithm was then developed for the regularized solution. The point process model was applied to an ensemble of neurons recorded from the cat motor cortex during a skilled reaching task. The implications of this analysis for the coding of skilled movement in primary motor cortex are discussed.

Collaboration


Dive into Demba Ba's collaborations.

Top Co-Authors

Emery N. Brown

Massachusetts Institute of Technology

Andrew H. Song

Massachusetts Institute of Technology