Pavlos Papadopoulos
University of Southern California
Publications
Featured research published by Pavlos Papadopoulos.
international conference on acoustics, speech, and signal processing | 2014
Pavlos Papadopoulos; Andreas Tsiartas; James Gibson; Shrikanth Narayanan
This paper introduces a supervised statistical framework for estimating the signal-to-noise ratio (SNR) of speech signals. Information on how noise corrupts a signal can help us compensate for its effects, especially in real-life applications where the usual assumption of white Gaussian noise does not hold and the speech boundaries in the signal are not known. We use features from which we can detect speech regions in a signal, without using Voice Activity Detection, and estimate the energies of those regions. We then use these features to train ordinary least squares regression models for various noise types. We compare this supervised method with state-of-the-art SNR estimation algorithms and show its superior performance on the tested noise types.
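The regression step described above can be sketched in a few lines of numpy. The feature dimensionality, the linear relation, and the synthetic data below are illustrative assumptions, not the paper's actual features, which are derived from estimated speech-region energies.

```python
import numpy as np

# Hypothetical training data: each row holds energy-ratio features
# extracted from a noisy utterance; targets are the true SNRs in dB.
rng = np.random.default_rng(0)
true_w = np.array([12.0, -3.5, 0.8])          # assumed linear relation
X = rng.uniform(0.0, 1.0, size=(200, 3))      # energy-ratio features
y = X @ true_w + rng.normal(0.0, 0.1, 200)    # noisy SNR labels

# Ordinary least squares: add a bias column and solve in one shot.
Xb = np.hstack([X, np.ones((X.shape[0], 1))])
w, *_ = np.linalg.lstsq(Xb, y, rcond=None)

# Estimate the SNR of a new signal from its features (bias term appended).
x_new = np.array([0.4, 0.2, 0.7, 1.0])
snr_estimate = x_new @ w
```

In the paper one such model is trained per noise type; the sketch shows a single model.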
IEEE Transactions on Audio, Speech, and Language Processing | 2016
Pavlos Papadopoulos; Andreas Tsiartas; Shrikanth Narayanan
Many speech processing algorithms and applications rely on explicit knowledge of the signal-to-noise ratio (SNR) in their design and implementation, and estimating the SNR of a signal can enhance the performance of such technologies. We propose a novel method for estimating the long-term SNR of speech signals based on features from which we can approximately detect regions of speech presence in a noisy signal. By measuring the energy in these regions, we create sets of energy ratios, from which we train regression models for different types of noise. If the type of noise that corrupts a signal is known, we use the corresponding regression model to estimate the SNR. When the noise is unknown, we use a deep neural network to find the “closest” regression model to estimate the SNR. Evaluations were performed on the TIMIT speech corpus, using noises from the NOISEX-92 noise database. Furthermore, we carried out cross-corpora experiments by training on TIMIT and NOISEX-92 and testing on the Wall Street Journal speech corpus and the DEMAND noise database. Our results show that our system provides accurate SNR estimates across different noise types and corpora, and that it outperforms other SNR estimation methods.
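The model-selection idea above can be illustrated as follows: fit one regression model per noise type, then route an unseen signal to the "closest" model. The paper uses a deep neural network for this routing; here a simple nearest-centroid rule stands in, and all features, weights, and offsets are contrived so the two noise types are separable.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical per-noise-type data: 2-D energy-ratio features -> SNR (dB).
# Each entry is (assumed linear weights, feature offset for separability).
noise_types = {
    "white":  (np.array([10.0, -2.0]), 0.0),
    "babble": (np.array([4.0, 6.0]), 2.0),
}

models, centroids = {}, {}
for name, (w_true, offset) in noise_types.items():
    X = rng.uniform(0.0, 1.0, size=(150, 2)) + offset
    y = X @ w_true + rng.normal(0.0, 0.1, 150)
    Xb = np.hstack([X, np.ones((150, 1))])
    models[name], *_ = np.linalg.lstsq(Xb, y, rcond=None)
    centroids[name] = X.mean(axis=0)

# Unknown noise: pick the closest model, then apply its regression weights.
x = np.array([2.3, 2.6])
chosen = min(centroids, key=lambda k: np.linalg.norm(centroids[k] - x))
snr = np.append(x, 1.0) @ models[chosen]
```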
conference of the international speech communication association | 2016
Brandon M. Booth; Rahul Gupta; Pavlos Papadopoulos; Ruchir Travadi; Shrikanth Narayanan
Sincerity is important in everyday human communication, and the perception of genuineness can greatly affect emotions and outcomes in social interactions. In this paper, submitted for the INTERSPEECH 2016 Sincerity Challenge, we examine a corpus of six different types of apologetic utterances from a variety of English speakers, articulated in different prosodic styles, and rate the sincerity of each remark. Since the utterances and semantic content of the examined database are controlled, we focus on tone of voice, exploring a plethora of acoustic and paralinguistic features not present in the baseline model and how well they contribute to human assessment of sincerity. We show that these additional features improve performance over the baseline model, and that conditioning learning models on the prosody of the utterances further boosts prediction accuracy. Our best system outperforms the challenge baseline and in principle can generalize well to other corpora.
conference of the international speech communication association | 2016
Pavlos Papadopoulos; Colin Vaz; Shrikanth Narayanan
Traditional denoising schemes require prior knowledge or statistics of the noise corrupting the signal, or estimate the noise from noise-only portions of the signal, which requires knowledge of speech boundaries. Extending denoising methods to perform well in unknown noise conditions can facilitate processing of data captured in different real-life environments and relax rigid data acquisition protocols. In this paper we propose two methods for denoising speech signals in unknown noise conditions. The first method has two stages: first, we use Long Term Signal Variability features to decide which noise model to use from a pool of available models; once we determine the noise type, we use Nonnegative Matrix Factorization (NMF) with a dictionary trained on that noise to denoise the signal. In the second method, we create a combined noise dictionary from different types of noise and use that dictionary in the denoising phase. Both of our systems improve signal quality, as measured by PESQ scores, for all the noise types we tested and across different signal-to-noise ratio (SNR) levels.
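The NMF denoising step can be sketched as follows: keep a pre-trained noise dictionary fixed, learn a speech dictionary and activations on the noisy spectrogram with multiplicative KL-divergence updates, and keep the speech component via a Wiener-style mask. The dimensions and random inputs below are placeholders; in practice V and W_noise come from STFT magnitudes.

```python
import numpy as np

rng = np.random.default_rng(2)
eps = 1e-9

# Hypothetical inputs: pre-trained noise dictionary and noisy spectrogram.
F, T, Ks, Kn = 64, 100, 8, 4
W_noise = rng.uniform(0.1, 1.0, size=(F, Kn))
V = rng.uniform(0.1, 1.0, size=(F, T))

# Speech dictionary and all activations are learned from scratch.
W_speech = rng.uniform(0.1, 1.0, size=(F, Ks))
H = rng.uniform(0.1, 1.0, size=(Ks + Kn, T))

def kl_div(V, WH):
    return np.sum(V * np.log((V + eps) / (WH + eps)) - V + WH)

kl_start = kl_div(V, np.hstack([W_speech, W_noise]) @ H)

for _ in range(100):
    W = np.hstack([W_speech, W_noise])
    WH = W @ H + eps
    # Multiplicative updates for KL-divergence NMF; noise atoms stay fixed.
    H *= (W.T @ (V / WH)) / (W.T @ np.ones_like(V) + eps)
    WH = W @ H + eps
    W_speech *= ((V / WH) @ H[:Ks].T) / (np.ones_like(V) @ H[:Ks].T + eps)

# Wiener-style mask keeps the speech component of the noisy spectrogram.
WH = np.hstack([W_speech, W_noise]) @ H + eps
speech_estimate = V * (W_speech @ H[:Ks]) / WH
kl_end = kl_div(V, WH)
```

Fixing the noise atoms while updating the rest is a block-coordinate scheme, so the KL objective is still non-increasing across iterations.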
international conference on acoustics, speech, and signal processing | 2010
Pavlos Papadopoulos; Vassilios Digalakis
Least-squares estimation has always been the main approach when applying prediction error methods (PEM) to the identification of linear dynamical systems. Regardless of the estimation algorithm, if there are no restrictions on the form of the matrices we want to estimate, the matrices can only be determined up to a linear transformation, so the result may differ from the true solution and the convergence of iterative algorithms may be affected. In this paper, we apply a new identification procedure based on the Expectation-Maximization framework to a family of identifiable state-space models. To our knowledge, this is the first complete solution to Maximum-Likelihood estimation for general linear state-space models.
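The non-identifiability mentioned above is easy to demonstrate numerically: two state-space realisations related by an invertible similarity transform T produce exactly the same input-output behaviour (the Markov parameters C A^k B), so unrestricted (A, B, C) cannot be recovered uniquely from data. The matrices below are arbitrary illustrative values.

```python
import numpy as np

# An arbitrary 2-state single-input single-output realisation.
A = np.array([[0.5, 0.1], [0.0, 0.3]])
B = np.array([[1.0], [0.5]])
C = np.array([[1.0, -1.0]])

# A fixed invertible similarity transform yields an equivalent realisation.
T = np.array([[2.0, 1.0], [0.0, 3.0]])
Ti = np.linalg.inv(T)
A2, B2, C2 = T @ A @ Ti, T @ B, C @ Ti

def markov(A, B, C, n=10):
    """First n Markov parameters C A^k B of the state-space model."""
    out, Ak = [], np.eye(A.shape[0])
    for _ in range(n):
        out.append((C @ Ak @ B).item())
        Ak = Ak @ A
    return np.array(out)

ir1 = markov(A, B, C)
ir2 = markov(A2, B2, C2)
```

Restricting attention to an identifiable family of models, as the paper does, removes this degree of freedom before applying EM.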
conference of the international speech communication association | 2015
Matthew P. Black; Daniel Bone; Zisis Iason Skordilis; Rahul Gupta; Wei Xia; Pavlos Papadopoulos; Sandeep Nallan Chakravarthula; Bo Xiao; Maarten Van Segbroeck; Jangwon Kim; Panayiotis G. Georgiou; Shrikanth Narayanan
conference of the international speech communication association | 2017
Pavlos Papadopoulos; Ruchir Travadi; Shrikanth Narayanan
international conference on acoustics, speech, and signal processing | 2018
Manoj Kumar; Pavlos Papadopoulos; Ruchir Travadi; Daniel Bone; Shrikanth Narayanan
international conference on acoustics, speech, and signal processing | 2018
Dogan Can; Victor R. Martinez; Pavlos Papadopoulos; Shrikanth Narayanan
conference of the international speech communication association | 2018
Nikolaos Flemotomos; Pavlos Papadopoulos; James Gibson; Shrikanth Narayanan