Publication


Featured research published by Tom Bäckström.


Journal of the Acoustical Society of America | 2002

Normalized amplitude quotient for parametrization of the glottal flow

Paavo Alku; Tom Bäckström; Erkki Vilkman

Normalized amplitude quotient (NAQ) is presented as a method to parametrize the glottal closing phase using two amplitude-domain measurements from waveforms estimated by inverse filtering. In this technique, the ratio between the amplitude of the ac flow and the negative peak amplitude of the flow derivative is first computed using the concept of equivalent rectangular pulse, a hypothetical signal located at the instant of the main excitation of the vocal tract. This ratio is then normalized with respect to the length of the fundamental period. Comparison between NAQ and its counterpart among the conventional time-domain parameters, the closing quotient, shows that the proposed parameter is more robust against distortions, such as measurement noise, that make the extraction of conventional time-based parameters of the glottal flow problematic. Experiments with breathy, normal, and pressed vowels indicate that NAQ can also separate the types of phonation effectively.
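The two-step computation described above (amplitude ratio, then normalization by the period length) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `naq`, its arguments, and the use of a first difference as the flow derivative are assumptions made for the example.

```python
import numpy as np

def naq(flow, fs, f0):
    """Normalized amplitude quotient of one glottal flow cycle (sketch).

    flow: samples of a single period of the estimated glottal flow
    fs:   sampling rate in Hz
    f0:   fundamental frequency in Hz
    """
    flow = np.asarray(flow, dtype=float)
    f_ac = flow.max() - flow.min()   # amplitude of the ac flow
    d_flow = np.diff(flow) * fs      # first-difference estimate of the derivative
    d_peak = -d_flow.min()           # magnitude of the negative derivative peak
    period = 1.0 / f0                # length of the fundamental period in seconds
    return f_ac / (d_peak * period)

# Hypothetical test signal: one raised-cosine pulse at 100 Hz, sampled at 16 kHz.
fs, f0 = 16000, 100
t = np.arange(fs // f0) / fs
pulse = 0.5 * (1.0 - np.cos(2.0 * np.pi * f0 * t))
value = naq(pulse, fs, f0)
```

For this symmetric raised-cosine pulse the continuous-time value is 1/π; real glottal pulses close faster than they open, so the derivative peak is larger and the quotient smaller.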


Folia Phoniatrica Et Logopaedica | 2003

Experiences of a Short Vocal Training Course for Call-Centre Customer Service Advisors

Laura Lehto; Leena Rantala; Erkki Vilkman; Paavo Alku; Tom Bäckström

It is commonly known that occupational voice users suffer from voice symptoms to varying extents. The purpose of this study was to find out the effects of a short (2-day) vocal training course on professional speakers’ voice. The subjects were 38 female and 10 male customer advisors, who mainly use the telephone during their working hours at a call centre. The findings showed that although the subjects did not suffer from severe voice problems, they reported that the short vocal training course had an effect on some of the vocal symptoms they had experienced. More than 50% of the females and males reported a decrease in the feeling of mucus and the consequent need to clear the throat, and diminished worsening of their voice. Over 60% thought that voice training had improved their vocal habits and none reported a negative influence of the course on their voice. Females also reported a reduction of vocal fatigue. The subjects were further asked to respond to 23 statements on how they experienced the voice training in general. The statements ‘I learned things that I didn’t know about the use of voice in general’ and ‘I got useful and important knowledge concerning my work’ were highly assessed by both females and males. The results suggest that even a short vocal training course might positively affect the self-reported well-being of persons working in a vocally loading occupation. However, to find out the long-term effects of a short training course, a follow-up study would need to be carried out.


Logopedics Phoniatrics Vocology | 2005

Voice symptoms of call-centre customer service advisers experienced during a work-day and effects of a short vocal training course

Laura Lehto; Paavo Alku; Tom Bäckström; Erkki Vilkman

Occupational voice users often suffer from voice symptoms to varying extents. The first goal of this study was to find out how telephone customer service advisers experience voice symptoms at different moments of the working day. The second goal was to investigate the effects of a short vocal training course arranged for telephone workers. The results indicate that although the subjects did not suffer from severe voice problems, the short vocal training course significantly reduced some of the vocal symptoms they had experienced. The results suggest that systematic consultation and training for occupational voice users in the field of occupational voice care would be advantageous.


International Conference on Acoustics, Speech, and Signal Processing | 2015

Arithmetic coding of speech and audio spectra using TCX based on linear predictive spectral envelopes

Tom Bäckström; Christian Helmrich

Unified speech and audio codecs often use a frequency-domain coding technique of the transform coded excitation (TCX) type. It is based on modeling the speech source with a linear predictor, spectral weighting by a perceptual model, and entropy coding of the frequency components. While previous approaches have used neighbouring frequency components to form a probability model for the entropy coder of spectral components, we propose to use the magnitude of the linear predictor to estimate the variance of spectral components. Since the linear predictor is transmitted in any case, this method does not require any additional side information. Subjective measurements show that the proposed method gives a statistically significant improvement in perceptual quality when the bit-rate is held constant. Consequently, the proposed method has been adopted in the 3GPP Enhanced Voice Services speech coding standard.
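The idea of deriving the entropy coder's probability model from the linear predictor can be illustrated with a short sketch. Everything specific here is an assumption for illustration only: the Laplacian distribution, the unit quantization step, and the names `lp_envelope`, `laplace_pmf` and `ideal_bits` do not come from the abstract or from the EVS standard.

```python
import numpy as np

def lp_envelope(a, n_bins):
    """Magnitude |1/A(e^{jw})| of the LP envelope at n_bins frequencies in (0, pi)."""
    w = np.pi * (np.arange(n_bins) + 0.5) / n_bins
    k = np.arange(len(a))
    A = np.exp(-1j * np.outer(w, k)) @ np.asarray(a, dtype=float)
    return 1.0 / np.abs(A)

def laplace_pmf(q, scale):
    """Probability that a Laplacian variable rounds to the integer q (step size 1)."""
    m = np.abs(np.asarray(q, dtype=float))
    tail = 0.5 * (np.exp(-(m - 0.5) / scale) - np.exp(-(m + 0.5) / scale))
    centre = 1.0 - np.exp(-0.5 / scale)
    return np.where(m == 0, centre, tail)

def ideal_bits(q, scale):
    """Ideal code length in bits of the quantized spectrum q for per-bin scales."""
    return float(-np.log2(laplace_pmf(q, scale)).sum())

# Hypothetical example: a first-order predictor and a synthetic spectrum whose
# per-bin spread follows the LP envelope.
a = [1.0, -0.9]
env = lp_envelope(a, 512)
rng = np.random.default_rng(0)
spectrum = np.round(rng.laplace(loc=0.0, scale=env))
bits_envelope = ideal_bits(spectrum, env)                          # model from the predictor
bits_flat = ideal_bits(spectrum, np.full_like(env, env.mean()))    # one scale for all bins
```

In this toy setting the envelope-driven model needs fewer bits than the flat model, and the decoder can rebuild the same `env` from the transmitted predictor, which is why no side information is needed.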


IEEE Signal Processing Letters | 2003

All-pole modeling technique based on weighted sum of LSP polynomials

Tom Bäckström; Paavo Alku

This study presents a new technique called weighted-sum line spectrum pair (WLSP) where an all-pole filter is defined by using a sum of weighted line spectrum pair polynomials. The WLSP yields a stable all-pole filter of order m, whose autocorrelation function coincides with that of the input signal between indices 0 and m-1. By sacrificing the exact matching at index m, the WLSP models the autocorrelation of the input signal at the indices above m more accurately than conventional linear prediction (LP). Experiments with vowels show that, in comparison to the conventional LP, WLSP yields all-pole spectra that model formants with an increased dynamic range between formant peaks and spectral valleys.
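A minimal sketch of the weighted-sum construction follows. It assumes one reading of the abstract: the line spectrum pair polynomials are formed from a conventional predictor of order m-1, so that their weighted sum has order m. The function names, the weight symbol `lam`, and the test autocorrelation are illustrative assumptions, not the paper's notation.

```python
import numpy as np

def lp_coefficients(r, order):
    """Conventional LP (autocorrelation method): solve the normal equations."""
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, -np.asarray(r[1:order + 1], dtype=float))
    return np.concatenate(([1.0], a))

def wlsp(r, m, lam):
    """Order-m all-pole denominator as a weighted sum of LSP polynomials."""
    a = lp_coefficients(r, m - 1)        # predictor of order m-1
    a_ext = np.concatenate((a, [0.0]))   # A(z) padded to degree m
    p = a_ext + a_ext[::-1]              # symmetric LSP polynomial
    q = a_ext - a_ext[::-1]              # antisymmetric LSP polynomial
    return lam * p + (1.0 - lam) * q

# Hypothetical autocorrelation sequence of a damped resonance.
r = [0.9 ** k * np.cos(0.5 * k) for k in range(5)]
d = wlsp(r, 4, 0.7)
poles = np.roots(d)   # poles of the all-pole filter 1/D(z)
```

Under this reading the sum equals the order m-1 predictor plus (2·lam − 1) times its reversed coefficients, which has the form of a Levinson recursion step, so for 0 < lam < 1 the poles stay inside the unit circle; lam = 0.5 reproduces the order m-1 predictor.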


Conference of the International Speech Communication Association | 2016

Blind Recovery of Perceptual Models in Distributed Speech and Audio Coding

Tom Bäckström; Florin Ghido; Johannes Fischer

Perceptual models are a central part of speech and audio codecs; they describe the relative perceptual importance of errors in different elements of the signal representation. In practice, the perceptual models consist of signal-dependent weighting factors which are used in the quantization of each element. For optimal performance, we would like to use the same perceptual model at the decoder. Because the perceptual model is signal-dependent, however, it is not known in advance at the decoder, so audio codecs generally transmit this model explicitly, at the cost of increased bit-consumption. In this work we present an alternative method which recovers the perceptual model at the decoder from the transmitted signal without any side-information. The approach will be especially useful in distributed sensor-networks and the Internet of things, where the added cost on bit-consumption from transmitting a perceptual model increases with the number of sensors.


International Conference on Acoustics, Speech, and Signal Processing | 2003

On the stability of constrained linear predictive models

Tom Bäckström; Paavo Alku

Stability of the all-pole model in conventional, unconstrained linear prediction with the autocorrelation criterion is well known. By imposing constraints on the optimisation problem, it is possible to define models of order m + l with m parameters. Traditionally, however, constraints have led to models whose stability is not guaranteed. In this paper, we discuss constrained linear predictive models where the constraint is one-dimensional (l = 1) and derive stability criteria for these models.


Logopedics Phoniatrics Vocology | 2009

Glottal inverse filtering with the closed-phase covariance analysis utilizing mathematical constraints in modelling of the vocal tract

Paavo Alku; Carlo Magi; Tom Bäckström

Closed-phase (CP) covariance analysis is a glottal inverse filtering method based on the estimation of the vocal tract with linear prediction (LP) during the closed phase of the vocal fold vibration cycle. Since the closed phase is typically short, the analysis is sensitive to the position of the covariance frame. The present study proposes a modified CP algorithm that imposes predefined values on the gains of the vocal tract inverse filter at angular frequencies 0 and π when optimizing the filter coefficients. With these constraints, vocal tract models are less prone to false low-frequency roots. Experiments show that the algorithm improves the robustness of CP analysis to the covariance frame position.
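Imposing a predefined gain at angular frequency 0 can be written as an equality-constrained least-squares problem. The sketch below is one way to set this up, assuming the covariance criterion and only the constraint at frequency 0; the function name, the KKT formulation and the example values are illustrative assumptions rather than the published algorithm.

```python
import numpy as np

def constrained_covariance_lp(frame, order, dc_gain):
    """Covariance-method LP with the inverse filter gain at w = 0 fixed (sketch)."""
    x = np.asarray(frame, dtype=float)
    # Each row holds [x[n], x[n-1], ..., x[n-order]] for n = order .. len(x)-1.
    X = np.array([x[n - order:n + 1][::-1] for n in range(order, len(x))])
    C = X.T @ X                          # covariance matrix of the frame
    G = np.zeros((2, order + 1))
    G[0, 0] = 1.0                        # constraint: a[0] = 1
    G[1, :] = 1.0                        # constraint: A(e^{j0}) = sum(a) = dc_gain
    h = np.array([1.0, dc_gain])
    # KKT system for: minimize a^T C a  subject to  G a = h
    kkt = np.block([[2.0 * C, G.T], [G, np.zeros((2, 2))]])
    rhs = np.concatenate((np.zeros(order + 1), h))
    return np.linalg.solve(kkt, rhs)[:order + 1]

# Hypothetical short analysis frame standing in for a closed-phase segment.
rng = np.random.default_rng(0)
a = constrained_covariance_lp(rng.standard_normal(64), 6, 0.5)
```

A constraint at angular frequency π could be added in the same way with a row of alternating signs, since A(e^{jπ}) is the alternating sum of the coefficients.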


Nordic Signal Processing Symposium | 2006

Objective and Subjective Evaluation of Seven Selected All-Pole Modelling Methods in Processing of Noise Corrupted Speech

Carlo Magi; Tom Bäckström; Paavo Alku

Spectral modeling properties of seven selected all-pole modeling methods were compared by using both objective and subjective tests. Model behavior was evaluated with vowel sounds corrupted by uncorrelated Gaussian and Laplacian background noise. Objective tests were computed with the logarithmic spectral differences (SD2) and subjective speech quality was assessed with the degradation category rating (DCR) listening test. In both tests, the WLPC method, where the weighting function was the short-time energy of the speech signal, gave the best results. The correlation between the objective and subjective results was found to be remarkably strong.


Nordic Signal Processing Symposium | 2006

Harmonic All-Pole Modelling for Glottal Inverse Filtering

Tom Bäckström; Paavo Alku

Glottal inverse filtering is a process in which the acoustic effect of the human vocal tract is removed from the speech signal to obtain the flow through the vocal folds, the glottal flow. In this article, we present a fully automatic algorithm for inverse filtering, harmonic all-pole inverse filtering, that employs the harmonic structure of speech to obtain the impulse response of both the vocal tract and the glottal flow. The method assumes that the vocal tract and glottal flow waveform can be modelled with minimum- and maximum-phase AR-models, respectively, and that each glottal cycle has a distinct excitation. We present results for male and female speakers with normal, breathy and pressed voices.

Collaboration


Dive into Tom Bäckström's collaborations.

Top Co-Authors

Erkki Vilkman
Helsinki University Central Hospital

Johannes Fischer
University of Erlangen-Nuremberg

Laura Lehto
Helsinki University Central Hospital

Achim Kuntz
University of Erlangen-Nuremberg

Carlo Magi
Helsinki University of Technology

Christian Helmrich
University of Erlangen-Nuremberg