Network


Latest external collaborations at the country level. Dive into details by clicking on the dots.

Hotspot


Dive into the research topics where Athanasios Mouchtaris is active.

Publication


Featured research published by Athanasios Mouchtaris.


IEEE Transactions on Audio, Speech, and Language Processing | 2013

Real-Time Multiple Sound Source Localization and Counting Using a Circular Microphone Array

Despoina Pavlidi; Anthony Griffin; Matthieu Puigt; Athanasios Mouchtaris

In this work, a multiple sound source localization and counting method is presented that imposes relaxed sparsity constraints on the source signals. A uniform circular microphone array is used to overcome the ambiguities of linear arrays; however, the underlying concepts (sparse component analysis and matching-pursuit-based operation on the histogram of estimates) are applicable to any microphone array topology. Our method is based on detecting time-frequency (TF) zones where one source is dominant over the others. Using appropriately selected TF components in these “single-source” zones, the proposed method jointly estimates the number of active sources and their corresponding directions of arrival (DOAs) by applying a matching-pursuit-based approach to the histogram of DOA estimates. The method is shown to have excellent performance for DOA estimation and source counting, and to be highly suitable for real-time applications due to its low complexity. Through simulations (in various signal-to-noise-ratio conditions and reverberant environments) and real-environment experiments, we show that our method outperforms other state-of-the-art DOA estimation and source counting methods in accuracy, while being significantly more efficient in computational complexity.
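The joint counting/DOA idea described above can be sketched as a matching-pursuit-style pass over a histogram of per-TF DOA estimates: repeatedly take the histogram peak, subtract an assumed single-source contribution, and stop when the remaining mass is too small. The kernel shape, bin width, and stopping threshold below are illustrative assumptions, not the paper's actual parameters:

```python
import numpy as np

def count_and_localize(doa_estimates, n_bins=360, kernel_width=10.0,
                       threshold=0.2, max_sources=6):
    """Jointly estimate the number of sources and their DOAs (degrees) from a
    pool of per-time-frequency DOA estimates, via a matching-pursuit-style
    pass over the circular histogram of estimates."""
    hist, edges = np.histogram(doa_estimates, bins=n_bins, range=(0.0, 360.0))
    hist = hist.astype(float)
    total = hist.sum()
    centers = 0.5 * (edges[:-1] + edges[1:])
    # Circular Gaussian-like kernel modelling the angular spread of estimates
    # coming from a single source (the width is an assumed tuning parameter).
    offsets = np.arange(n_bins)
    circ_dist = np.minimum(offsets, n_bins - offsets)
    kernel = np.exp(-0.5 * (circ_dist / kernel_width) ** 2)

    doas, residual = [], hist.copy()
    for _ in range(max_sources):
        peak = int(np.argmax(residual))
        contribution = residual[peak] * np.roll(kernel, peak)
        if contribution.sum() < threshold * total:
            break  # remaining histogram mass is too small to be a source
        doas.append(centers[peak])
        residual = np.maximum(residual - contribution, 0.0)
    return len(doas), sorted(doas)
```

The low complexity comes from operating on a fixed-size histogram rather than on the raw TF data in each iteration.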


IEEE Transactions on Audio, Speech, and Language Processing | 2006

Nonparallel training for voice conversion based on a parameter adaptation approach

Athanasios Mouchtaris; J. Van der Spiegel; P. Mueller

The objective of voice conversion algorithms is to modify the speech of a particular source speaker so that it sounds as if spoken by a different target speaker. Current conversion algorithms employ a training procedure during which the same utterances, spoken by both the source and target speakers, are needed for deriving the desired conversion parameters. Such a (parallel) corpus is often difficult or impossible to collect. Here, we propose an algorithm that relaxes this constraint, i.e., the training corpus does not necessarily contain the same utterances from both speakers. The proposed algorithm is based on speaker adaptation techniques, adapting the conversion parameters derived for a particular pair of speakers to a different pair for which only a nonparallel corpus is available. We show that adaptation reduces the error obtained when simply applying the conversion parameters of one pair of speakers to another by a factor that can reach 30%. A speaker identification measure is also employed that more insightfully portrays the importance of adaptation, while listening tests confirm the success of our method. Both the objective and subjective tests demonstrate that the proposed algorithm achieves results comparable to the ideal case, when a parallel corpus is available.
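For context, the parallel-training baseline that this paper relaxes can be sketched as a least-squares affine conversion function fit on time-aligned source/target feature frames. This is a deliberately simplified stand-in for the GMM-based conversion functions typically used in this literature, and the adaptation step that is the paper's actual contribution is not shown:

```python
import numpy as np

def train_linear_conversion(src_frames, tgt_frames):
    """Fit an affine conversion y ~= A x + b by least squares on time-aligned
    (parallel) source/target feature frames (rows = frames)."""
    X = np.hstack([src_frames, np.ones((src_frames.shape[0], 1))])  # bias column
    W, *_ = np.linalg.lstsq(X, tgt_frames, rcond=None)
    return W[:-1].T, W[-1]  # A (d_out x d_in), b (d_out,)

def convert(frames, A, b):
    """Apply the learned conversion to new source frames."""
    return frames @ A.T + b
```

Adaptation, in this setting, means reusing `A` and `b` learned for one speaker pair and adjusting them for a new pair from nonparallel data instead of refitting from scratch.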


International Conference on Acoustics, Speech, and Signal Processing | 2004

Non-parallel training for voice conversion by maximum likelihood constrained adaptation

Athanasios Mouchtaris; J. Van der Spiegel; P. Mueller

The objective of voice conversion methods is to modify the speech characteristics of a particular speaker in such a manner that it sounds like speech by a different target speaker. Current voice conversion algorithms derive a conversion function by estimating its parameters from a corpus that contains the same utterances spoken by both speakers. Such a corpus, usually referred to as a parallel corpus, is often difficult or even impossible to collect. Here, we propose a voice conversion method that does not require a parallel corpus for training, i.e., the spoken utterances of the two speakers need not be the same; speaker adaptation techniques are employed to adapt the conversion parameters derived from a different pair of speakers to the particular pair of source and target speakers. We show that adaptation reduces the error obtained when simply applying the conversion parameters of one pair of speakers to another by a factor that can reach 30% in many cases, with performance comparable to the ideal case when a parallel corpus is available.


IEEE Transactions on Multimedia | 2000

Inverse filter design for immersive audio rendering over loudspeakers

Athanasios Mouchtaris; Panagiotis Reveliotis; Chris Kyriakakis

Immersive audio systems can be used to render virtual sound sources in three-dimensional (3-D) space around a listener. This is achieved by simulating the head-related transfer function (HRTF) amplitude and phase characteristics using digital filters. In this paper, we examine certain key signal processing considerations in spatial sound rendering over headphones and loudspeakers. We address the problem of crosstalk inherent in loudspeaker rendering and examine two methods for implementing crosstalk cancellation and loudspeaker frequency response inversion in real time. We demonstrate that it is possible to achieve crosstalk cancellation of 30 dB using both methods, but one of the two (the Fast RLS Transversal Filter Method) offers a significant advantage in terms of computational efficiency. Our analysis is easily extendable to nonsymmetric listening positions and moving listeners.
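The crosstalk-cancellation objective described above can be illustrated with a naive regularized frequency-domain matrix inversion. The paper itself designs time-domain inverse filters (e.g. via the fast RLS transversal filter), so this is only a sketch of the goal H·C ≈ I per frequency bin, and the regularization constant is an arbitrary assumption:

```python
import numpy as np

def crosstalk_canceller(H, reg=1e-3):
    """Per-frequency-bin regularized inversion of the 2x2 matrix H[f] of
    loudspeaker-to-ear transfer functions, so that H[f] @ C[f] ~= I.
    `reg` is an assumed Tikhonov constant guarding ill-conditioned bins."""
    C = np.empty_like(H)
    I = np.eye(2)
    for f in range(H.shape[0]):
        Hf = H[f]
        C[f] = np.linalg.solve(Hf.conj().T @ Hf + reg * I, Hf.conj().T)
    return C
```

Passing the two loudspeaker signals through `C` before playback makes the acoustic paths approximately deliver each channel only to its intended ear.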


Signal Processing | 2015

Localizing multiple audio sources in a wireless acoustic sensor network

Anthony Griffin; Anastasios Alexandridis; Despoina Pavlidi; Yiannis Mastorakis; Athanasios Mouchtaris

In this work, we propose a grid-based method to estimate the location of multiple sources in a wireless acoustic sensor network, where each sensor node contains a microphone array and only transmits direction-of-arrival (DOA) estimates in each time interval, reducing the transmissions to the central processing node. We present new work on modeling the DOA estimation error in such a scenario. Through extensive, realistic simulations, we show that our method outperforms other state-of-the-art methods, in both accuracy and complexity. We also present localization results of real recordings in an outdoor cell of a sensor network.

Highlights:
- We examine localization in a WASN where each node transmits DOA estimates.
- We perform DOA estimation error modeling and examine the merging of nearby sources.
- We present a real-time low-complexity method for localization of multiple sources.
- Results indicate the advantages of our method in accuracy/computational complexity.
- We present localization results of real recordings in an outdoor cell of a sensor network.
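A minimal single-source version of the grid-search idea might look like the following, scoring each candidate grid point by its angular disagreement with the nodes' DOA estimates. This is a hypothetical simplification: the paper's method additionally models the DOA estimation error and handles multiple simultaneous sources:

```python
import numpy as np

def grid_localize(node_positions, node_doas, grid_x, grid_y):
    """Score every candidate grid point by the sum of squared (wrapped)
    differences between the bearing from each node to the candidate and
    that node's DOA estimate (radians); return the best-scoring point."""
    best, best_cost = None, np.inf
    for x in grid_x:
        for y in grid_y:
            cost = 0.0
            for (nx, ny), doa in zip(node_positions, node_doas):
                bearing = np.arctan2(y - ny, x - nx)
                diff = np.angle(np.exp(1j * (bearing - doa)))  # wrap to (-pi, pi]
                cost += diff ** 2
            if cost < best_cost:
                best, best_cost = (x, y), cost
    return best
```

Because each node only contributes an angle, the per-interval network traffic is a handful of scalars rather than raw audio.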


IEEE Transactions on Audio, Speech, and Language Processing | 2011

Single-Channel and Multi-Channel Sinusoidal Audio Coding Using Compressed Sensing

Anthony Griffin; Toni Hirvonen; Christos Tzagkarakis; Athanasios Mouchtaris; Panagiotis Tsakalides

Compressed sensing (CS) samples signals at a much lower rate than the Nyquist rate if they are sparse in some basis. In this paper, the CS methodology is applied to sinusoidally modeled audio signals. As this model is sparse by definition in the frequency domain (being equal to the sum of a small number of sinusoids), we investigate whether CS can be used to encode audio signals at low bitrates. In contrast to encoding the sinusoidal parameters (amplitude, frequency, phase) as current state-of-the-art methods do, we propose encoding a few randomly selected samples of the time-domain description of the sinusoidal component (per signal segment). The potential of applying compressed sensing to both single-channel and multi-channel audio coding is examined. The listening test results are encouraging, indicating that the proposed approach can achieve performance comparable to that of state-of-the-art methods. Given that CS can lead to novel coding systems where the sampling and compression operations are combined into one low-complexity step, the proposed methodology can be considered an important step towards applying the CS framework to audio coding applications.
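The decoder side of such a scheme can be sketched with orthogonal matching pursuit over a DFT dictionary, recovering a frequency-sparse segment from a few randomly selected time-domain samples. This is only an illustration of the CS recovery principle; the psychoacoustic modelling and quantization of the actual coder are omitted:

```python
import numpy as np

def omp_dft(y, idx, n, k_max, tol=1e-8):
    """Recover a length-n signal that is sparse in frequency from samples y
    observed at time indices idx, by orthogonal matching pursuit over the
    columns of the (partial) inverse-DFT dictionary."""
    cols = np.arange(n)
    A = np.exp(2j * np.pi * np.outer(idx, cols) / n) / np.sqrt(n)  # Psi[idx, :]
    y = np.asarray(y, dtype=complex)
    residual, support, coef = y.copy(), [], None
    for _ in range(k_max):
        if np.linalg.norm(residual) < tol:
            break
        k = int(np.argmax(np.abs(A.conj().T @ residual)))  # most correlated atom
        if k not in support:
            support.append(k)
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    X = np.zeros(n, dtype=complex)
    if coef is not None:
        X[support] = coef
    Psi = np.exp(2j * np.pi * np.outer(cols, cols) / n) / np.sqrt(n)
    return np.real(Psi @ X)  # reconstructed time-domain segment
```

The encoder, correspondingly, only has to store the sample values and their indices, which is where the one-step sampling-plus-compression appeal comes from.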


International Conference on Acoustics, Speech, and Signal Processing | 2012

Real-time multiple sound source localization using a circular microphone array based on single-source confidence measures

Despoina Pavlidi; Matthieu Puigt; Anthony Griffin; Athanasios Mouchtaris

We propose a novel real-time adaptive localization approach for multiple sources using a circular array, in order to suppress the localization ambiguities faced with linear arrays, assuming a weak sound source sparsity that is derived from blind source separation methods. Our proposed method performs very well both in simulations and in real conditions, running at 50% of real time.


International Conference on Acoustics, Speech, and Signal Processing | 2006

Musical Genre Classification VIA Generalized Gaussian and Alpha-Stable Modeling

Christos Tzagkarakis; Athanasios Mouchtaris; Panagiotis Tsakalides

This paper describes a novel methodology for automatic musical genre classification based on a feature extraction/statistical similarity measurement approach. First, we perform a 1-D wavelet decomposition of the music signal and we model the resulting subband coefficients using the generalized Gaussian density (GGD) and the alpha-stable distribution. Subsequently, the GGD and alpha-stable distribution parameters are estimated during the feature extraction step, while the similarity between two music signals is measured by employing the Kullback-Leibler divergence (KLD) between their corresponding estimated wavelet distributions. We evaluate the performance of the proposed methodology on a dataset consisting of six different musical genre sets.
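The KLD between two zero-mean generalized Gaussian densities has a well-known closed form (Do and Vetterli's expression), which is what makes this similarity measure cheap to evaluate once the scale and shape parameters of each subband have been estimated. A minimal stdlib-only sketch:

```python
from math import exp, lgamma, log

def kld_ggd(a1, b1, a2, b2):
    """Closed-form Kullback-Leibler divergence between two zero-mean
    generalized Gaussian densities with scales a1, a2 and shapes b1, b2
    (Do & Vetterli's expression); log-gamma is used for stability."""
    return (log((b1 * a2) / (b2 * a1))
            + lgamma(1.0 / b2) - lgamma(1.0 / b1)
            + (a1 / a2) ** b2 * exp(lgamma((b2 + 1.0) / b1) - lgamma(1.0 / b1))
            - 1.0 / b1)
```

Shape 2 reduces the GGD to a Gaussian, so the expression then collapses to the familiar Gaussian KLD, which is a handy sanity check.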


European Signal Processing Conference | 2015

3D localization of multiple sound sources with intensity vector estimates in single source zones

Despoina Pavlidi; Symeon Delikaris-Manias; Ville Pulkki; Athanasios Mouchtaris

This work proposes a novel method for 3D direction of arrival (DOA) estimation based on the sound intensity vector estimation, via the encoding of the signals of a spherical microphone array from the space domain to the spherical harmonic domain. The sound intensity vector is estimated on detected single source zones (SSZs), where one source is dominant. A smoothed 2D histogram of these estimates reveals the DOA of the present sources and through an iterative process, accurate 3D DOA information can be obtained. The performance of the proposed method is demonstrated through simulations in various signal-to-noise ratio and reverberation conditions.


Workshop on Applications of Signal Processing to Audio and Acoustics | 2013

Localizing multiple audio sources from DOA estimates in a wireless acoustic sensor network

Anthony Griffin; Athanasios Mouchtaris

In this work we propose a method to estimate the position of multiple sources in a wireless acoustic sensor network, where each sensor node only transmits direction-of-arrival (DOA) estimates in each time interval, minimizing the transmissions to the processing node. Our method is based on the intersection of DOA estimates with outlier removal, and as such is very computationally efficient. We explore the performance of our method through extensive simulations and real measurements.
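The intersection-with-outlier-removal idea can be sketched for a single source as follows. The pairwise line intersection is exact geometry; the median-based outlier rule and the single-source restriction are simplifying assumptions of this sketch, not the paper's exact procedure:

```python
import numpy as np

def intersect_doas(node_positions, doas):
    """Estimate a single source location as the mean of pairwise intersections
    of the nodes' DOA bearing lines, after a median-distance outlier test."""
    pts = []
    for i in range(len(doas)):
        for j in range(i + 1, len(doas)):
            p = np.asarray(node_positions[i], dtype=float)
            q = np.asarray(node_positions[j], dtype=float)
            di = np.array([np.cos(doas[i]), np.sin(doas[i])])
            dj = np.array([np.cos(doas[j]), np.sin(doas[j])])
            denom = di[0] * dj[1] - di[1] * dj[0]  # 2-D cross product
            if abs(denom) < 1e-9:
                continue  # (nearly) parallel bearings: skip this pair
            t = ((q - p)[0] * dj[1] - (q - p)[1] * dj[0]) / denom
            pts.append(p + t * di)
    pts = np.array(pts)
    dist = np.linalg.norm(pts - np.median(pts, axis=0), axis=1)
    keep = dist <= 2.0 * np.median(dist) + 1e-9  # crude outlier rejection
    return pts[keep].mean(axis=0)
```

Since only angles travel over the network and each intersection is a few arithmetic operations, the cost scales with the number of node pairs, not with the audio data.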

Collaboration


Dive into Athanasios Mouchtaris's collaborations.

Top Co-Authors

Chris Kyriakakis

University of Southern California


Demetrios Cantzos

University of Southern California


P. Mueller

University of Pennsylvania


Shrikanth Narayanan

University of Southern California
