Publication


Featured research published by Hagai Attias.


Neural Computation | 1999

Independent factor analysis

Hagai Attias

We introduce the independent factor analysis (IFA) method for recovering independent hidden sources from their observed mixtures. IFA generalizes and unifies ordinary factor analysis (FA), principal component analysis (PCA), and independent component analysis (ICA), and can handle not only square noiseless mixing but also the general case where the number of mixtures differs from the number of sources and the data are noisy. IFA is a two-step procedure. In the first step, the source densities, mixing matrix, and noise covariance are estimated from the observed data by maximum likelihood. For this purpose we present an expectation-maximization (EM) algorithm, which performs unsupervised learning of an associated probabilistic model of the mixing situation. Each source in our model is described by a mixture of gaussians; thus, all the probabilistic calculations can be performed analytically. In the second step, the sources are reconstructed from the observed data by an optimal nonlinear estimator. A variational approximation of this algorithm is derived for cases with a large number of sources, where the exact algorithm becomes intractable. Our IFA algorithm reduces to the one for ordinary FA when the sources become gaussian, and to an EM algorithm for PCA in the zero-noise limit. We derive an additional EM algorithm specifically for noiseless IFA. This algorithm is shown to be superior to ICA since it can learn arbitrary source densities from the data. Beyond blind separation, IFA can be used for modeling multidimensional data by a highly constrained mixture of gaussians and as a tool for nonlinear signal encoding.
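
The layered generative model described above lends itself to a short simulation. The sketch below (NumPy, with made-up parameter values) draws each source from its own mixture of Gaussians, mixes them through a non-square matrix, and adds sensor noise; an EM implementation of IFA would estimate exactly these quantities from the observed mixtures.

```python
import numpy as np

rng = np.random.default_rng(0)

# IFA generative model (sketch): each of L sources is a 1-D mixture of
# Gaussians; observations are a noisy linear mixture x = H s + n.
n_sources, n_obs, T = 2, 3, 1000

# hypothetical MoG source parameters: 2 components per source
weights = np.array([[0.5, 0.5], [0.3, 0.7]])
means = np.array([[-2.0, 2.0], [0.0, 3.0]])
stds = np.array([[0.5, 0.5], [1.0, 0.3]])

def sample_sources(T):
    s = np.empty((n_sources, T))
    for i in range(n_sources):
        comp = rng.choice(2, size=T, p=weights[i])   # pick MoG component
        s[i] = rng.normal(means[i, comp], stds[i, comp])
    return s

H = rng.normal(size=(n_obs, n_sources))   # mixing matrix (more mixtures than sources)
s = sample_sources(T)
x = H @ s + 0.1 * rng.normal(size=(n_obs, T))   # noisy observed mixtures
```

Because each source density is a mixture of Gaussians, the posterior over sources given `x` is itself a mixture of Gaussians, which is what makes the EM updates analytic.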


IEEE Transactions on Audio, Speech, and Language Processing | 2007

Blind Source Separation Exploiting Higher-Order Frequency Dependencies

Taesu Kim; Hagai Attias; Soo-Young Lee; Te-Won Lee

Blind source separation (BSS) is a challenging problem in real-world environments where sources are time delayed and convolved. The problem becomes more difficult in very reverberant conditions, with an increasing number of sources, and with geometric configurations of the sources such that finding directionality is not sufficient for source separation. In this paper, we propose a new algorithm that exploits higher-order frequency dependencies of source signals in order to separate them when they are mixed. In the frequency domain, this formulation assumes that dependencies exist between frequency bins instead of defining independence for each frequency bin. In this manner, we can avoid the well-known frequency permutation problem. To derive the learning algorithm, we define a cost function, which is an extension of mutual information between multivariate random variables. By introducing a source prior that models the inherent frequency dependencies, we obtain a simple form of a multivariate score function. In experiments, we generate simulated data with various kinds of sources in various environments. We evaluate the performance and compare it with that of other well-known algorithms. The results show that the proposed algorithm outperforms the others in most cases. The algorithm is also able to accurately recover six sources with six microphones; in this case, we obtain about 16-dB signal-to-interference ratio (SIR) improvement. Similar performance is observed in real conference-room recordings with three human speakers reading sentences and one loudspeaker playing music.
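
The multivariate score function is the heart of the method. The toy natural-gradient loop below (an IVA-style sketch on synthetic frequency-domain data, not the authors' exact algorithm; all sizes and the step size are arbitrary) shows how the score couples every frequency bin of a source, which is what sidesteps the permutation problem.

```python
import numpy as np

rng = np.random.default_rng(1)

# synthetic STFT-domain data: F frequency bins, N sources, T frames
F, N, T = 8, 2, 500
S = rng.laplace(size=(F, N, T)) + 1j * rng.laplace(size=(F, N, T))
A = rng.normal(size=(F, N, N)) + 1j * rng.normal(size=(F, N, N))
X = np.einsum('fij,fjt->fit', A, S)            # per-bin mixed observations

W = np.stack([np.eye(N, dtype=complex)] * F)   # one unmixing matrix per bin
lr = 0.1
for _ in range(50):
    Y = np.einsum('fij,fjt->fit', W, X)        # current source estimates
    # multivariate score: normalizes each source by its norm ACROSS all
    # frequency bins, so the bins are no longer treated independently
    norm = np.sqrt((np.abs(Y) ** 2).sum(axis=0, keepdims=True)) + 1e-9
    phi = Y / norm
    for f in range(F):
        G = np.eye(N) - phi[f] @ Y[f].conj().T / T   # natural-gradient term
        W[f] = W[f] + lr * G @ W[f]
```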


Computer Vision and Pattern Recognition | 2010

Topic regression multi-modal Latent Dirichlet Allocation for image annotation

Duangmanee Putthividhy; Hagai Attias; Srikantan S. Nagarajan

We present topic-regression multi-modal Latent Dirichlet Allocation (tr-mmLDA), a novel statistical topic model for the task of image and video annotation. At the heart of our new annotation model lies a novel latent variable regression approach to capture correlations between image or video features and annotation texts. Instead of sharing a set of latent topics between the 2 data modalities as in the formulation of correspondence LDA in [2], our approach introduces a regression module to correlate the 2 sets of topics, which captures more general forms of association and allows the number of topics in the 2 data modalities to be different. We demonstrate the power of tr-mmLDA on 2 standard annotation datasets: a 5000-image subset of COREL and a 2687-image LabelMe dataset. The proposed association model shows improved performance over correspondence LDA as measured by caption perplexity.
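
The regression module can be illustrated generatively: image topic proportions are drawn first, and caption topic proportions are produced by a (hypothetical) linear map followed by a softmax, so the two modalities may use different numbers of topics. All sizes and parameters below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

K_img, K_txt, V_img, V_txt = 10, 6, 50, 40   # topic/vocabulary sizes (arbitrary)
A = rng.normal(size=(K_txt, K_img))          # hypothetical regression weights
b = rng.normal(size=K_txt)
img_topics = rng.dirichlet(np.ones(V_img), size=K_img)  # p(visual word | topic)
txt_topics = rng.dirichlet(np.ones(V_txt), size=K_txt)  # p(caption word | topic)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def generate_pair(n_img_words=100, n_txt_words=8):
    theta_img = rng.dirichlet(np.ones(K_img))    # image topic proportions
    # regression link: caption topics depend on image topics, yet the two
    # modalities keep different numbers of topics
    theta_txt = softmax(A @ theta_img + b)
    z_img = rng.choice(K_img, size=n_img_words, p=theta_img)
    img_words = [rng.choice(V_img, p=img_topics[z]) for z in z_img]
    z_txt = rng.choice(K_txt, size=n_txt_words, p=theta_txt)
    txt_words = [rng.choice(V_txt, p=txt_topics[z]) for z in z_txt]
    return img_words, txt_words

img_words, txt_words = generate_pair()
```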


Neural Computation | 1998

Blind source separation and deconvolution: the dynamic component analysis algorithm

Hagai Attias; Christoph E. Schreiner

We derive a novel family of unsupervised learning algorithms for blind separation of mixed and convolved sources. Our approach is based on formulating the separation problem as a learning task of a spatiotemporal generative model, whose parameters are adapted iteratively to minimize suitable error functions, thus ensuring stability of the algorithms. The resulting learning rules achieve separation by exploiting high-order spatiotemporal statistics of the mixture data. Different rules are obtained by learning generative models in the frequency and time domains, whereas a hybrid frequency-time model leads to the best performance. These algorithms generalize independent component analysis to the case of convolutive mixtures and exhibit superior performance on instantaneous mixtures. An extension of the relative-gradient concept to the spatiotemporal case leads to fast and efficient learning rules with equivariant properties. Our approach can incorporate information about the mixing situation when available, ...
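
The convolutive mixing that these algorithms must undo is x_i(t) = sum_j sum_tau h_ij(tau) s_j(t - tau); a minimal simulation, with made-up filter lengths and sizes:

```python
import numpy as np

rng = np.random.default_rng(3)

n_src, n_mic, taps, T = 2, 2, 5, 1000
s = rng.laplace(size=(n_src, T))                  # super-Gaussian sources
H = 0.5 * rng.normal(size=(taps, n_mic, n_src))   # mixing filters h_ij(tau)

# convolutive mixture: each microphone hears delayed, filtered sums of sources
x = np.zeros((n_mic, T))
for tau in range(taps):
    x[:, tau:] += H[tau] @ s[:, :T - tau]
```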


NeuroImage | 2010

Robust Bayesian estimation of the location, orientation, and time course of multiple correlated neural sources using MEG

David P. Wipf; Julia P. Owen; Hagai Attias; Kensuke Sekihara; Srikantan S. Nagarajan

The synchronous brain activity measured via MEG (or EEG) can be interpreted as arising from a collection (possibly large) of current dipoles or sources located throughout the cortex. Estimating the number, location, and time course of these sources remains a challenging task, one that is significantly compounded by the effects of source correlations and unknown orientations and by the presence of interference from spontaneous brain activity, sensor noise, and other artifacts. This paper derives an empirical Bayesian method for addressing each of these issues in a principled fashion. The resulting algorithm guarantees descent of a cost function uniquely designed to handle unknown orientations and arbitrary correlations. Robust interference suppression is also easily incorporated. In a restricted setting, the proposed method is shown to produce theoretically zero reconstruction error estimating multiple dipoles even in the presence of strong correlations and unknown orientations, unlike a variety of existing Bayesian localization methods or common signal processing techniques such as beamforming and sLORETA. Empirical results on both simulated and real data sets verify the efficacy of this approach.
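
For orientation, the sketch below sets up the linear forward model y = Lx + noise on synthetic data and applies a classical L2 minimum-norm inverse, one of the baselines such Bayesian methods improve upon, not the paper's algorithm. The lead field and all sizes are synthetic.

```python
import numpy as np

rng = np.random.default_rng(4)

n_sensors, n_sources, T = 64, 200, 50
L = rng.normal(size=(n_sensors, n_sources))        # synthetic lead field
x_true = np.zeros((n_sources, T))
x_true[17] = np.sin(np.linspace(0, 4 * np.pi, T))  # one active "dipole"
y = L @ x_true + 0.1 * rng.normal(size=(n_sensors, T))  # sensor measurements

# classical minimum-norm estimate: x_hat = L^T (L L^T + lam I)^{-1} y
lam = 1.0
G = L @ L.T + lam * np.eye(n_sensors)
x_hat = L.T @ np.linalg.solve(G, y)
```

Minimum-norm solutions like this smear energy across nearby sources and degrade under source correlation, which is precisely the failure mode the empirical Bayesian approach is designed to avoid.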


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2003

A graphical model for audiovisual object tracking

Matthew J. Beal; Nebojsa Jojic; Hagai Attias

We present a new approach to modeling and processing multimedia data. This approach is based on graphical models that combine audio and video variables. We demonstrate it by developing a new algorithm for tracking a moving object in a cluttered, noisy scene using two microphones and a camera. Our model uses unobserved variables to describe the data in terms of the process that generates them. It is therefore able to capture and exploit the statistical structure of the audio and video data separately, as well as their mutual dependencies. Model parameters are learned from data via an EM algorithm, and automatic calibration is performed as part of this procedure. Tracking is done by Bayesian inference of the object location from data. We demonstrate successful performance on multimedia clips captured in real world scenarios using off-the-shelf equipment.
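
The payoff of combining modalities can be seen in a toy Bayesian fusion: independent audio and video likelihoods over horizontal object position multiply into a sharper posterior. Gaussian likelihoods here are a stand-in for the paper's full graphical model.

```python
import numpy as np

# toy 1-D fusion: audio (time-delay) and video (template match) each give a
# noisy likelihood over horizontal position; Bayes' rule combines them
grid = np.linspace(-1.0, 1.0, 201)   # candidate positions

def gaussian(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2)

audio_lik = gaussian(grid, mu=0.30, sigma=0.25)   # coarse but unambiguous
video_lik = gaussian(grid, mu=0.20, sigma=0.05)   # precise
posterior = audio_lik * video_lik                 # flat prior over positions
posterior /= posterior.sum()
estimate = grid[posterior.argmax()]               # fused MAP location
```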


NeuroImage | 2007

A probabilistic algorithm integrating source localization and noise suppression for MEG and EEG data.

Johanna M. Zumer; Hagai Attias; Kensuke Sekihara; Srikantan S. Nagarajan

We have developed a novel probabilistic model that estimates neural source activity measured by MEG and EEG data while suppressing the effect of interference and noise sources. The model estimates contributions to sensor data from evoked sources, interference sources and sensor noise using Bayesian methods and by exploiting knowledge about their timing and spatial covariance properties. Full posterior distributions are computed rather than just the MAP estimates. In simulation, the algorithm can accurately localize and estimate the time courses of several simultaneously active dipoles, with rotating or fixed orientation, at noise levels typical for averaged MEG data. The algorithm even performs reasonably at noise levels typical of an average of just a few trials. The algorithm is superior to beamforming techniques, which we show to be an approximation to our graphical model, in estimation of temporally correlated sources. The success of this algorithm in using MEG data to localize bilateral auditory cortex, low-SNR somatosensory activations, and an epileptic spike source is also demonstrated.
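
As a much simpler stand-in for the interference-suppression idea, one can estimate an interference subspace from pre-stimulus (baseline) data and project it out of the post-stimulus recordings. This is signal-space-projection-style denoising on synthetic data, not the paper's Bayesian model.

```python
import numpy as np

rng = np.random.default_rng(5)

n_ch, T_pre, T_post = 32, 400, 200
B = rng.normal(size=(n_ch, 3))        # 3 spatial interference patterns
pre = B @ rng.normal(size=(3, T_pre)) + 0.1 * rng.normal(size=(n_ch, T_pre))

evoked_pattern = rng.normal(size=(n_ch, 1))
evoked = evoked_pattern @ np.sin(np.linspace(0, np.pi, T_post))[None, :]
post = (evoked + B @ rng.normal(size=(3, T_post))
        + 0.1 * rng.normal(size=(n_ch, T_post)))

# estimate the interference subspace from baseline data, then project it out
U, _, _ = np.linalg.svd(pre, full_matrices=False)
P = np.eye(n_ch) - U[:, :3] @ U[:, :3].T   # projector onto the complement
cleaned = P @ post
```

The projection removes almost all interference energy but also distorts any evoked activity that overlaps the interference subspace; the Bayesian formulation in the paper models the contributions jointly instead of projecting.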


International Conference on Acoustics, Speech, and Signal Processing | 2004

A structured speech model with continuous hidden dynamics and prediction-residual training for tracking vocal tract resonances

Li Deng; Leo Lee; Hagai Attias; Alex Acero

A novel approach is developed for efficient and accurate tracking of vocal tract resonances, which are natural frequencies of the resonator from larynx to lips, in fluent speech. The tracking algorithm is based on a version of the structured speech model consisting of continuous-valued hidden dynamics and a piecewise-linearized prediction function from resonance frequencies and bandwidths to LPC cepstra. We present details of the piecewise linearization design process and an adaptive training technique for the parameters that characterize the prediction residuals. An iterative tracking algorithm is described and evaluated that embeds both the prediction-residual training and the piecewise linearization design in an adaptive Kalman filtering framework. Experiments on tracking vocal tract resonances in Switchboard speech data demonstrate high accuracy in the results, as well as the effectiveness of residual training embedded in the algorithm. Our approach differs from traditional formant trackers in that it provides meaningful results even during consonantal closures when the supra-laryngeal source may cause no spectral prominences in speech acoustics.
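
The Kalman filtering framework can be illustrated with a scalar sketch: track a slowly drifting resonance frequency from noisy per-frame measurements. The real tracker observes LPC cepstra through a piecewise-linearized mapping; here the observation is the frequency itself plus noise, which keeps the sketch fully linear.

```python
import numpy as np

rng = np.random.default_rng(6)

T = 200
true_f = 500.0 + np.cumsum(rng.normal(scale=2.0, size=T))  # drifting resonance (Hz)
z = true_f + rng.normal(scale=30.0, size=T)                # noisy frame measurements

q, r = 4.0, 900.0        # process and measurement noise variances (assumed known)
f_hat, P = z[0], r       # initialize from the first measurement
est = np.empty(T)
for t in range(T):
    # predict: random-walk dynamics f_t = f_{t-1} + w_t
    P = P + q
    # update with measurement z_t
    K = P / (P + r)                    # Kalman gain
    f_hat = f_hat + K * (z[t] - f_hat)
    P = (1 - K) * P
    est[t] = f_hat

rmse_raw = np.sqrt(np.mean((z - true_f) ** 2))
rmse_kf = np.sqrt(np.mean((est - true_f) ** 2))
```

In the paper's setting, `q` and `r` are not fixed by hand: the prediction-residual training adapts the residual statistics inside the same filtering loop.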


European Conference on Computer Vision | 2002

Audio-Video Sensor Fusion with Probabilistic Graphical Models

Matthew J. Beal; Hagai Attias; Nebojsa Jojic

We present a new approach to modeling and processing multimedia data. This approach is based on graphical models that combine audio and video variables. We demonstrate it by developing a new algorithm for tracking a moving object in a cluttered, noisy scene using two microphones and a camera. Our model uses unobserved variables to describe the data in terms of the process that generates them. It is therefore able to capture and exploit the statistical structure of the audio and video data separately, as well as their mutual dependencies. Model parameters are learned from data via an EM algorithm, and automatic calibration is performed as part of this procedure. Tracking is done by Bayesian inference of the object location from data. We demonstrate successful performance on multimedia clips captured in real world scenarios using off-the-shelf equipment.


International Conference on Acoustics, Speech, and Signal Processing | 2003

Variational inference and learning for segmental switching state space models of hidden speech dynamics

Leo Lee; Hagai Attias; Li Deng

This paper describes novel and powerful variational EM algorithms for the segmental switching state space models used in speech applications, which are capable of capturing key internal (or hidden) dynamics of natural speech production. Hidden dynamic models (HDMs) have recently become a class of promising acoustic models to incorporate crucial speech-specific knowledge and overcome many inherent weaknesses of traditional HMMs. However, the lack of powerful and efficient statistical learning algorithms is one of the main obstacles preventing them from being well studied and widely used. Since exact inference and learning are intractable, a variational approach is taken to develop effective approximate algorithms. We have implemented the segmental constraint crucial for modeling speech dynamics and present algorithms for recovering hidden speech dynamics and discrete speech units from acoustic data only. The effectiveness of the algorithms developed is verified by experiments on simulation and Switchboard speech data.
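
The generative story of a segmental switching state-space model can be sketched as follows: a discrete speech unit, held fixed over a segment (the segmental constraint), selects the parameters of a continuous linear dynamics whose state is observed through noise. All parameters below are invented; the paper's variational EM inverts models of this general form.

```python
import numpy as np

rng = np.random.default_rng(7)

T = 120
# two hypothetical "speech units", each defining a 1-D linear dynamics that
# pulls the hidden state toward a unit-specific target value
targets = {0: -1.0, 1: 2.0}
a = 0.9                            # autoregressive coefficient
segments = [(0, 60), (1, 60)]      # segmental constraint: units persist

x = np.empty(T)                    # hidden dynamic state
t, x_prev = 0, 0.0
for unit, length in segments:
    for _ in range(length):
        x_prev = a * x_prev + (1 - a) * targets[unit] + rng.normal(scale=0.05)
        x[t] = x_prev
        t += 1

y = x + rng.normal(scale=0.2, size=T)   # observed acoustics (noisy view of x)
```

Inference reverses this: given only `y`, recover both the hidden trajectory `x` and the discrete unit sequence, which is exactly what makes exact inference intractable and motivates the variational approximation.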

Collaboration


Dive into Hagai Attias's collaborations.

Top Co-Authors

Kensuke Sekihara

Tokyo Metropolitan University


Julia P. Owen

University of California
