Publication


Featured research published by Søren Vang Andersen.


Speech Coding, 2002, IEEE Workshop Proceedings. | 2002

iLBC - a linear predictive coder with robustness to packet losses

Søren Vang Andersen; Willem Bastiaan Kleijn; Roar Hagen; Jan Linden; Manohar N. Murthi; J Skoglund

In this paper, we discuss the internet low bit rate codec (iLBC) with an emphasis on the frame-independent long-term prediction. The frame-independent long-term prediction is a method to exploit pitch-lag correlations in the encoding of speech without suffering multiple-frame speech degradation in connection with transmission loss. We present mean opinion scores for the iLBC codec and show by means of signal examples how the nature of degradation in a predictive codec based on frame-independent long-term prediction differs from that of traditional CELP codecs.


IEEE Transactions on Audio, Speech, and Language Processing | 2006

Hidden Markov model-based packet loss concealment for voice over IP

Christoffer A. Rødbro; Manohar N. Murthi; Søren Vang Andersen; Søren Holdt Jensen

As voice over IP proliferates, packet loss concealment (PLC) at the receiver has emerged as an important factor in determining voice quality of service. Through the use of heuristic variations of signal and parameter repetition and overlap-add interpolation to handle packet loss, conventional PLC systems largely ignore the dynamics of the statistical evolution of the speech signal, possibly leading to perceptually annoying artifacts. To address this problem, we propose the use of hidden Markov models for PLC. With a hidden Markov model (HMM) tracking the evolution of speech signal parameters, we demonstrate how PLC is performed within a statistical signal processing framework. Moreover, we show how the HMM is used to index a specially designed PLC module for the particular signal context, leading to signal-contingent PLC. Simulation examples, objective tests, and subjective listening tests are provided showing the ability of an HMM-based PLC built with a sinusoidal analysis/synthesis model to provide better loss concealment than a conventional PLC based on the same sinusoidal model for all types of speech signals, including onsets and signal transitions.


International Conference on Acoustics, Speech, and Signal Processing | 2000

On the mutual information between frequency bands in speech

Mattias Nilsson; Søren Vang Andersen; W.B. Kleijn

In this paper we investigate the mutual information in speech between the spectral envelope of the high frequency band and low frequency bands of various widths. Direct methods on the computation of the mutual information often result in an excessive amount of data required even for modest situations. We reduce the required amount of data by quantizing the low band leading to a lower bound expression on the mutual information. We indicate by simulation that this lower bound is in the same order of magnitude as the true mutual information. Simulations on speech show that we have no less than 0.1 bit of shared information between the slope of the high band and the low frequency band from 0-4 kHz. Performing the analogous simulation with the gain of the high band we obtained no less than 0.45 bit of mutual information.
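The lower-bound idea in this abstract rests on the fact that quantizing one variable cannot increase its mutual information with another. The following is a minimal sketch of that principle on a toy discrete distribution, not the authors' estimator; the joint probabilities and the helper names `mutual_information` and `merge_rows` are illustrative assumptions.

```python
import math

def mutual_information(joint):
    """Mutual information in bits for a joint pmf given as rows (X) by columns (Y)."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0.0:
                mi += p * math.log2(p / (px[i] * py[j]))
    return mi

def merge_rows(joint, groups):
    """Quantize X by merging the rows listed in each group into one cell."""
    n_cols = len(joint[0])
    return [[sum(joint[i][j] for i in group) for j in range(n_cols)]
            for group in groups]

# Hypothetical joint pmf: X has four levels, Y has two (values are made up).
joint = [
    [0.20, 0.05],
    [0.15, 0.10],
    [0.10, 0.15],
    [0.05, 0.20],
]

# Coarser quantization of X: levels {0, 1} and {2, 3} are merged.
coarse = merge_rows(joint, [[0, 1], [2, 3]])

mi_fine = mutual_information(joint)
mi_coarse = mutual_information(coarse)
print(f"I(X;Y) = {mi_fine:.4f} bit, I(Q(X);Y) = {mi_coarse:.4f} bit")
```

The coarse value never exceeds the fine one, which is why an estimate computed on a quantized low band can serve as a lower bound while needing far fewer data points.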


EURASIP Journal on Advances in Signal Processing | 2005

Speech enhancement with natural sounding residual noise based on connected time-frequency speech presence regions

Karsten Vandborg Sørensen; Søren Vang Andersen

We propose time-frequency domain methods for noise estimation and speech enhancement. A speech presence detection method is used to find connected time-frequency regions of speech presence. These regions are used by a noise estimation method and both the speech presence decisions and the noise estimate are used in the speech enhancement method. Different attenuation rules are applied to regions with and without speech presence to achieve enhanced speech with natural sounding attenuated background noise. The proposed speech enhancement method has a computational complexity that makes it feasible for application in hearing aids. An informal listening test shows that the proposed speech enhancement method has significantly higher mean opinion scores than minimum mean-square error log-spectral amplitude (MMSE-LSA) and decision-directed MMSE-LSA.


International Conference on Acoustics, Speech, and Signal Processing | 2000

Exploiting time and frequency masking in consistent sinusoidal analysis-synthesis

Renat Vafin; Søren Vang Andersen; W.B. Kleijn

In this paper, we elaborate on the issue of analysis-synthesis consistency in sinusoidal coding. Our analysis is based on windowed sinusoids, and uses the same amplitude-complementary window as is used in the overlap-add synthesis. Reconstructions of the neighboring segments are taken into account when forming a particular analysis segment. Sinusoidal estimation is based on a perceptual criterion. In our new procedure, when analyzing the current segment we take advantage of the forward masking effect due to estimated sinusoids in the previous segments (possibly overlapping with the current segment). Experimental results verify that the number of sinusoids can be reduced significantly with our time masking model, without introducing perceptual artifacts in the reconstructed signal.
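An amplitude-complementary window is one whose overlapping copies sum to one, so overlap-add of windowed segments returns the original samples. The sketch below checks this for a periodic Hann window with 50% overlap; the window choice, segment length, and function names are assumptions for illustration and are not taken from the paper.

```python
import math

def hann_periodic(n_len):
    """Periodic Hann window; copies shifted by n_len // 2 sum to one."""
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * n / n_len))
            for n in range(n_len)]

def overlap_add(signal, n_len):
    """Window each segment (hop = n_len // 2) and add the segments back together."""
    hop = n_len // 2
    window = hann_periodic(n_len)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - n_len + 1, hop):
        for n in range(n_len):
            out[start + n] += window[n] * signal[start + n]
    return out

# Hypothetical test signal and segment length.
signal = [math.sin(0.1 * k) for k in range(64)]
n_len = 16
rebuilt = overlap_add(signal, n_len)

# Only samples covered by two overlapping segments are reconstructed exactly,
# so the first and last half-segments are excluded from the comparison.
hop = n_len // 2
max_err = max(abs(a - b) for a, b in zip(signal[hop:-hop], rebuilt[hop:-hop]))
print(f"max interior reconstruction error: {max_err:.2e}")
```

Using the same window for analysis and synthesis, as the abstract describes, means the weights applied to the estimated sinusoids match those applied during reconstruction.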


EURASIP Journal on Advances in Signal Processing | 2005

A block-based linear MMSE noise reduction with a high temporal resolution modeling of the speech excitation

Chunjian Li; Søren Vang Andersen

A comprehensive linear minimum mean squared error (LMMSE) approach for parametric speech enhancement is developed. The proposed algorithms aim at joint LMMSE estimation of signal power spectra and phase spectra, as well as exploitation of correlation between spectral components. The major cause of this interfrequency correlation is shown to be the prominent temporal power localization in the excitation of voiced speech. LMMSE estimators in time domain and frequency domain are first formulated. To obtain the joint estimator, we model the spectral signal covariance matrix as a full covariance matrix instead of a diagonal covariance matrix as is the case in the Wiener filter derived under the quasi-stationarity assumption. To accomplish this, we decompose the signal covariance matrix into a synthesis filter matrix and an excitation matrix. The synthesis filter matrix is built from estimates of the all-pole model coefficients, and the excitation matrix is built from estimates of the instantaneous power of the excitation sequence. A decision-directed power spectral subtraction method and a modified multipulse linear predictive coding (MPLPC) method are used in these estimations, respectively. The spectral domain formulation of the LMMSE estimator reveals important insight in interfrequency correlations. This is exploited to significantly reduce computational complexity of the estimator. For resource-limited applications such as hearing aids, the performance-to-complexity trade-off can be conveniently adjusted by tuning the number of spectral components to be included in the estimate of each component. Experiments show that the proposed algorithm is able to reduce more noise than a number of other approaches selected from the state of the art. The proposed algorithm improves the segmental SNR of the noisy signal by 13 dB for the white noise case with an input SNR of 0 dB.
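The segmental SNR reported here is commonly computed as the average of per-frame SNR values in dB. The sketch below shows one such definition under stated assumptions: non-overlapping frames, no clamping of per-frame values, and frames with zero signal or zero error skipped. The paper's exact frame length and conventions are not given in the abstract, and `segmental_snr_db` is a hypothetical name.

```python
import math

def segmental_snr_db(clean, processed, frame_len):
    """Mean of per-frame SNRs in dB over non-overlapping frames (assumed definition)."""
    values = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        ref = clean[start:start + frame_len]
        est = processed[start:start + frame_len]
        signal_energy = sum(x * x for x in ref)
        error_energy = sum((x - y) ** 2 for x, y in zip(ref, est))
        # Skip frames where the ratio is undefined.
        if signal_energy > 0.0 and error_energy > 0.0:
            values.append(10.0 * math.log10(signal_energy / error_energy))
    return sum(values) / len(values) if values else float("nan")

# Toy example: a constant signal with a constant 10% amplitude error.
clean = [1.0] * 8
processed = [0.9] * 8
print(f"segmental SNR: {segmental_snr_db(clean, processed, 4):.2f} dB")
```

An improvement figure such as the one quoted in the abstract would be the difference between this measure evaluated on the enhanced signal and on the noisy input.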


International Conference on Knowledge-Based and Intelligent Information and Engineering Systems | 2003

Amplitude modulated sinusoidal models for audio modeling and coding

Mads Græsbøll Christensen; Søren Vang Andersen; Søren Holdt Jensen

In this paper a new perspective on modeling of transient phenomena in the context of sinusoidal audio modeling and coding is presented. In our approach the task of finding time-varying amplitudes for sinusoidal models is viewed as an AM demodulation problem. A general perfect reconstruction framework for amplitude modulated sinusoids is introduced and model reductions lead to a model for audio compression. Demodulation methods are considered for estimation of the time-varying amplitudes, and inherent constraints and limitations are discussed. Finally, some applications are considered and discussed and the concepts are demonstrated to improve sinusoidal modeling of audio and speech.


IEEE Workshop on Speech Coding for Telecommunications | 1997

Quantization noise modeling in low-delay speech coding

Søren Vang Andersen; Søren Holdt Jensen; Egon Bech Hansen

We present a low-delay vector predictive transform coder with quantization noise modeling. The coder uses overlapped transform coding and a Kalman filter with a backward estimated LPC model of the speech signal combined with an additive noise model of the quantization noise. The noise model is driven by the vector quantizer. Each quantizer cell has its own error correlation matrix, which is kept in a table and addressed using the quantizer index. Simulation results indicate that overlapped transform coding combined with noise modeling can improve the quality of the decoded speech signal.


Global Communications Conference | 2012

Improving QoE for Skype video call in Mobile Broadband Network

Jing Zhu; Rath Vannithamby; Christoffer A. Rødbro; Mingyu Chen; Søren Vang Andersen

In this paper, we introduce an app-radio cross-layer framework for improving Quality of user Experience (QoE) of Over-The-Top (OTT) Internet applications in Mobile Broadband Networks, e.g. WiMAX, 3G, LTE, etc. We apply methods similar to the well-known network layer techniques: Explicit Congestion Notification (ECN) and Differentiated Services (DiffServ) to application layer and wireless link layer. We focus on UDP based delay-sensitive real-time video call applications such as Skype, and propose to exchange cross-layer information, e.g. congestion, packet priority, etc., through API (application programming interface) based control signals instead of the existing ECN and DiffServ fields in IP data packet header. We built a prototype system with Microsoft Windows XP OS and Intel Mobile WiMAX network adapter, and use POLQA Mean Opinion Score (MOS) to measure the audio QoE for the video call. Empirical results show that immediate congestion indication effectively speeds up Skype response to bandwidth variation, and intra-flow prioritization further reduces the delay of audio packets. Both significantly improve the audio MOS of a video call.


International Conference on Acoustics, Speech, and Signal Processing | 2009

On GMM Kalman predictive coding of LSFs for packet loss

Shaminda Subasingha; Manohar N. Murthi; Søren Vang Andersen

Gaussian Mixture Model (GMM)-based Kalman predictive coders have been shown to perform better than baseline GMM Recursive Coders in predictive coding of Line Spectral Frequencies (LSFs) for both clean and packet loss conditions. However, these stationary GMM Kalman predictive coders were not specifically designed for operation in packet loss conditions. In this paper, we demonstrate an approach to the design of GMM-based predictive coding for packet loss channels. In particular, we show how a stationary GMM Kalman predictive coder can be modified to obtain a set of encoding and decoding modes, each with different Kalman gains. This approach leads to more robust performance of predictive coding of LSFs in packet loss conditions, as the coder mismatch between the encoder and decoder is minimized. Simulation results show that this Robust GMM Kalman predictive coder performs better than other baseline GMM predictive coders with no increase in complexity. To the best of our knowledge, no previous work has specifically examined the design of GMM predictive coders for packet loss conditions.

Collaboration


Dive into Søren Vang Andersen's collaboration.

Top Co-Authors


Egon Bech Hansen

Technical University of Denmark
