Publication


Featured research published by Ag Armin Kohlrausch.


EURASIP Journal on Advances in Signal Processing | 2005

Parametric coding of stereo audio

Dj Jeroen Breebaart; Sljde Steven van de Par; Ag Armin Kohlrausch; Erik Gosuinus Petrus Schuijers

Parametric-stereo coding is a technique to efficiently code a stereo audio signal as a monaural signal plus a small amount of parametric overhead to describe the stereo image. The stereo properties are analyzed, encoded, and reinstated in a decoder according to spatial psychoacoustical principles. The monaural signal can be encoded using any (conventional) audio coder. Experiments show that the parameterized description of spatial properties enables a highly efficient, high-quality stereo audio representation.
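
As a rough illustration of the idea in this abstract, the sketch below reduces a stereo pair to a mono downmix plus two spatial parameters: a level difference and an interchannel correlation. It is a simplified broadband version written for this listing, not the authors' coder; the function name `stereo_parameters` and the test signal are hypothetical, and a real parametric-stereo coder would estimate such parameters per frequency band and time frame.

```python
import math

def stereo_parameters(left, right):
    """Return (mono, level_difference_db, correlation) for one stereo frame.

    Broadband simplification: a practical coder would work per band and frame.
    """
    power_left = sum(x * x for x in left)
    power_right = sum(x * x for x in right)
    cross = sum(l * r for l, r in zip(left, right))
    level_difference_db = 10 * math.log10(power_left / power_right)
    correlation = cross / math.sqrt(power_left * power_right)
    mono = [(l + r) / 2 for l, r in zip(left, right)]  # simple downmix
    return mono, level_difference_db, correlation

# Hypothetical test frame: the right channel is the left channel at half amplitude.
left = [math.sin(2 * math.pi * 440 * n / 44100) for n in range(1024)]
right = [0.5 * x for x in left]
mono, ild_db, icc = stereo_parameters(left, right)
```

For this frame the level difference comes out near 6 dB and the correlation near 1, so the stereo image is fully described by the mono signal and two numbers.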


Journal of the Acoustical Society of America | 2001

Binaural processing model based on contralateral inhibition. I. Model structure

Dj Jeroen Breebaart; Sljde Steven van de Par; Ag Armin Kohlrausch

This article presents a quantitative binaural signal detection model which extends the monaural model described by Dau et al. [J. Acoust. Soc. Am. 99, 3615-3622 (1996)]. The model is divided into three stages. The first stage comprises peripheral preprocessing in the right and left monaural channels. The second stage is a binaural processor which produces a time-dependent internal representation of the binaurally presented stimuli. This stage is based on the Jeffress delay line extended with tapped attenuator lines. Through this extension, the internal representation codes both interaural time and intensity differences. In contrast to most present-day models, which are based on excitatory-excitatory interaction, the binaural interaction in the present model is based on contralateral inhibition of ipsilateral signals. The last stage, a central processor, extracts a decision variable that can be used to detect the presence of a signal in a detection task, but could also derive information about the position and the compactness of a sound source. In two accompanying articles, the model predictions are compared with data obtained with human observers in a great variety of experimental conditions.
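
The second stage described above can be pictured as a grid of excitation-inhibition (EI) elements, each tuned to one internal delay and one attenuation; the element whose settings cancel the interaural differences shows the least activity. The sketch below is a minimal illustration under stated assumptions, not the published model: it applies delay and attenuation to the left channel only, skips the peripheral preprocessing, and uses a hypothetical 500 Hz tone with hypothetical names such as `ei_activity`.

```python
import math

def ei_activity(left, right, delay, atten_db, start):
    """Mean squared residual after the delayed, scaled left signal inhibits the right."""
    gain = 10 ** (atten_db / 20)
    total = 0.0
    for n in range(start, len(left)):
        diff = gain * left[n - delay] - right[n]
        total += diff * diff
    return total / (len(left) - start)

# Hypothetical stimulus: the right ear receives the left signal 24 samples later, 6 dB weaker.
fs, freq, length = 48000, 500, 960
true_delay, true_atten_db = 24, -6
left = [math.sin(2 * math.pi * freq * n / fs) for n in range(length)]
right = [10 ** (true_atten_db / 20) * left[n - true_delay] if n >= true_delay else 0.0
         for n in range(length)]

# Scan the grid of taps; delays stay below half a period to avoid phase ambiguity.
max_delay = 40
best, best_activity = None, float("inf")
for delay in range(0, max_delay + 1):
    for atten_db in range(-10, 11, 2):
        activity = ei_activity(left, right, delay, atten_db, start=max_delay)
        if activity < best_activity:
            best, best_activity = (delay, atten_db), activity
```

The position of the minimum in the (delay, attenuation) grid encodes the interaural time and intensity differences together, which is the property the abstract attributes to the tapped attenuator lines.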


Attention Perception & Psychophysics | 2008

Audiovisual synchrony and temporal order judgments: Effects of experimental method and stimulus type

Rlj Rob van Eijk; Ag Armin Kohlrausch; James F. Juola; Sljde Steven van de Par

When an audio-visual event is perceived in the natural environment, a physical delay will always occur between the arrival of the leading visual component and that of the trailing auditory component. This natural timing relationship suggests that the point of subjective simultaneity (PSS) should occur at an auditory delay greater than or equal to 0 msec. A review of the literature suggests that PSS estimates derived from a temporal order judgment (TOJ) task differ from those derived from a synchrony judgment (SJ) task, with (unnatural) auditory-leading PSS values reported mainly for the TOJ task. We report data from two stimulus types that differed in terms of complexity, namely, (1) a flash and a click and (2) a bouncing ball and an impact sound. The same participants judged the temporal order and synchrony of both stimulus types, using three experimental methods: (1) a TOJ task with two response categories (“audio first” or “video first”), (2) an SJ task with two response categories (“synchronous” or “asynchronous”; SJ2), and (3) an SJ task with three response categories (“audio first,” “synchronous,” or “video first”; SJ3). Both stimulus types produced correlated PSS estimates with the SJ tasks, but the estimates from the TOJ procedure were uncorrelated with those obtained from the SJ tasks. These results suggest that the SJ task should be preferred over the TOJ task when the primary interest is in perceived audio-visual synchrony.


Journal of the Acoustical Society of America | 1996

A quantitative model of the "effective" signal processing in the auditory system. II. Simulations and measurements

Torsten Dau; Dirk Püschel; Ag Armin Kohlrausch

This and the accompanying paper [Dau et al., J. Acoust. Soc. Am. 99, 3615-3622 (1996)] describe a quantitative model for signal processing in the auditory system. The model combines several stages of preprocessing with a decision device that has the properties of an optimal detector. The present paper compares model predictions for a variety of experimental conditions with the performance of human observers. Simulated and psychophysically determined thresholds were estimated with a three-interval forced-choice adaptive procedure. All model parameters were kept constant for all simulations discussed in this paper. For frozen-noise maskers, the effects of the following stimulus parameters were examined: signal frequency, signal phase, and temporal position and duration of the signal within the masker under conditions of simultaneous masking; masker level and masker duration under conditions of forward masking; and backward masking. The influence of signal phase and the temporal position of the signal, including positions at masker onset, was determined for a random-noise masker and compared with corresponding results obtained for a frozen noise. The model describes all the experimental data with an accuracy of a few dB with the following exceptions: forward-masked thresholds obtained with brief maskers are too high and the change in threshold with a change in signal duration is too small. Both discrepancies have their origin in the adaptation stages in the preprocessing part of the model. On the basis of the wide range of simulated conditions we conclude that the present model is a successful approach to describing the detection process in the human auditory system.


Journal of the Acoustical Society of America | 1995

Phase effects in masking related to dispersion in the inner ear. II. Masking period patterns of short targets

Ag Armin Kohlrausch; Andres Sander

This article investigates how the amplitude and phase characteristics of the inner ear influence the spectrotemporal representation of harmonic complex sounds. Five experiments are reported, in each of which three sets of maskers are compared that differ only in their phase spectra. The amplitude spectra of the complexes were flat and the phase choices were (a) zero phase, (b) Schroeder phases with a positive sign, and (c) Schroeder phases with a negative sign. In the first four experiments, the spectra contained all harmonics between 200 and 2000 Hz. In experiments 1 and 2, the signal frequency was fixed at 1100 Hz and the fundamental frequency of the maskers was varied. In experiments 3 and 4, the fundamental frequency of the maskers was fixed and the signal frequency varied between 200 and 2000 Hz. In experiments 1 and 3, the signal duration was long compared to the period of the maskers. In experiments 2 and 4, the signal duration was only 5 ms and thresholds were determined for different time points within the masker's period. The results show a strong correlation between the minima of the short-signal thresholds and the threshold of the long signal. In experiment 5, the spectral extent of the masker was shifted to values one octave lower (100 to 1000 Hz) or one or two octaves higher (400 to 4000 Hz and 800 to 8000 Hz, respectively). For each spectral region, masked thresholds of a long signal were obtained for three values of the fundamental frequency. In all five experiments the thresholds depended very much on the specific phase choices with differences of up to 25 dB. The masker with a negative Schroeder phase always led to the highest thresholds. The thresholds of the masker with a positive Schroeder phase, on the other hand, were for a wide range of parameters lower than the thresholds for the zero-phase masker.
These phase effects are most likely caused by the phase characteristic of the basilar-membrane filter, which affects the flat envelopes of the two Schroeder-phase maskers in a very different way. For an appropriate choice of parameters, one of the two becomes even more strongly modulated than the zero-phase complex. This latter observation imposes some restrictions on the second derivative (curvature) of the phase-versus-frequency relation for the auditory filters.
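
To make the three masker types concrete, the sketch below synthesizes one period of a flat-spectrum harmonic complex with zero phase and with positive and negative Schroeder phases, then compares their crest factors (peak divided by RMS). The phase rule theta_n = ±pi·n·(n − 1)/N, with n the component index and N the number of components, is an assumed, commonly quoted form that the abstract itself does not state, and sign conventions differ between papers. The harmonic range 2 to 20 corresponds to 200 to 2000 Hz only for a hypothetical 100 Hz fundamental.

```python
import math

def harmonic_complex(lowest, highest, phase_sign, samples_per_period=960):
    """One period of an equal-amplitude harmonic complex.

    phase_sign = 0 gives zero phase; +1 or -1 gives Schroeder phases
    under the assumed rule theta_n = sign * pi * n * (n - 1) / N.
    """
    harmonics = list(range(lowest, highest + 1))
    count = len(harmonics)
    wave = []
    for k in range(samples_per_period):
        sample = 0.0
        for index, harmonic in enumerate(harmonics, start=1):
            theta = phase_sign * math.pi * index * (index - 1) / count
            sample += math.cos(2 * math.pi * harmonic * k / samples_per_period + theta)
        wave.append(sample)
    return wave

def crest_factor(wave):
    """Peak magnitude divided by RMS value."""
    rms = math.sqrt(sum(x * x for x in wave) / len(wave))
    return max(abs(x) for x in wave) / rms

crest_zero = crest_factor(harmonic_complex(2, 20, 0))
crest_pos = crest_factor(harmonic_complex(2, 20, +1))
crest_neg = crest_factor(harmonic_complex(2, 20, -1))
```

The zero-phase complex is strongly peaked, while the two Schroeder-phase complexes are time-reversed copies of each other with the same, much flatter envelope. Any threshold difference between them therefore has to arise after filtering, which is the argument the abstract makes about basilar-membrane phase characteristics.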


IEEE Transactions on Audio, Speech, and Language Processing | 2011

A Probabilistic Model for Robust Localization Based on a Binaural Auditory Front-End

Tobias May; Sljde Steven van de Par; Ag Armin Kohlrausch

Although extensive research has been done in the field of machine-based localization, the degrading effect of reverberation and the presence of multiple sources on localization performance has remained a major problem. Motivated by the ability of the human auditory system to robustly analyze complex acoustic scenes, the associated peripheral stage is used in this paper as a front-end to estimate the azimuth of sound sources based on binaural signals. One classical approach to localize an acoustic source in the horizontal plane is to estimate the interaural time difference (ITD) between both ears by searching for the maximum in the cross-correlation function. Apart from ITDs, the interaural level difference (ILD) can contribute to localization, especially at higher frequencies where the wavelength becomes smaller than the diameter of the head, leading to ambiguous ITD information. The interdependency of ITD and ILD on azimuth is a complex pattern that depends also on the room acoustics, and is therefore learned by azimuth-dependent Gaussian mixture models (GMMs). Multiconditional training is performed to take into account the variability of the binaural features which results from multiple sources and the effect of reverberation. The proposed localization model outperforms state-of-the-art localization techniques in simulated adverse acoustic conditions.
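
The classical ITD estimate mentioned in this abstract, the lag that maximizes the interaural cross-correlation, can be sketched in a few lines. This is a broadband toy version under stated assumptions, not the paper's front-end: there is no auditory filterbank, no ILD, and no GMM stage, and the sampling rate, noise signal, and function name `estimate_itd` are hypothetical.

```python
import random

def estimate_itd(left, right, fs, max_lag):
    """Return the lag, in seconds, that maximizes the cross-correlation.

    Positive values mean the right-ear signal lags the left-ear signal.
    """
    n = len(left)
    best_lag, best_value = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        acc = 0.0
        # Only use index pairs where both left[i] and right[i + lag] exist.
        for i in range(max(0, -lag), min(n, n - lag)):
            acc += left[i] * right[i + lag]
        if acc > best_value:
            best_lag, best_value = lag, acc
    return best_lag / fs

# Hypothetical binaural pair: white noise, right ear delayed by 13 samples.
rng = random.Random(0)
fs = 44100
delay = 13
noise = [rng.gauss(0.0, 1.0) for _ in range(2000)]
left = noise
right = [0.0] * delay + noise[:-delay]
itd = estimate_itd(left, right, fs, max_lag=40)
```

With a broadband source the correlation peak is unambiguous. For narrowband, high-frequency content several peaks of similar height appear, which is the ITD ambiguity that motivates adding ILD information.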


EURASIP Journal on Advances in Signal Processing | 2005

A perceptual model for sinusoidal audio coding based on spectral integration

Steven van de Par; Ag Armin Kohlrausch; Richard Heusdens; Jesper Jensen; Søren Holdt Jensen

Psychoacoustical models have been used extensively within audio coding applications over the past decades. Recently, parametric coding techniques have been applied to general audio and this has created the need for a psychoacoustical model that is specifically suited for sinusoidal modelling of audio signals. In this paper, we present a new perceptual model that predicts masked thresholds for sinusoidal distortions. The model relies on signal detection theory and incorporates more recent insights about spectral and temporal integration in auditory masking. As a consequence, the model is able to predict the distortion detectability. In fact, the distortion detectability defines a (perceptually relevant) norm on the underlying signal space which is beneficial for optimisation algorithms such as rate-distortion optimisation or linear predictive coding. We evaluate the merits of the model by combining it with a sinusoidal extraction method and compare the results with those obtained with the ISO MPEG-1 Layer I-II recommended model. Listening tests show a clear preference for the new model. More specifically, the model presented here leads to a reduction of more than 20% in terms of number of sinusoids needed to represent signals at a given quality level.


Journal of the Acoustical Society of America | 1996

Intrinsic envelope fluctuations and modulation-detection thresholds for narrow-band noise carriers

Torsten Dau; Jesko L. Verhey; Ag Armin Kohlrausch

A model is presented which calculates the intrinsic envelope power of a bandpass noise carrier within the passband of a hypothetical modulation filter tuned to a specific modulation frequency. Model predictions are compared to experimentally obtained amplitude modulation (AM) detection thresholds. In experiment 1, thresholds for modulation rates of 5, 25, and 100 Hz imposed on a bandpass Gaussian noise carrier with a fixed upper cutoff frequency of 6 kHz and a bandwidth in the range from 1 to 6000 Hz were obtained. In experiment 2, three noises with different spectra of the intrinsic fluctuations served as the carrier: Gaussian noise, multiplied noise, and low-noise noise. In each case, the carrier was spectrally centered at 5 kHz and had a bandwidth of 50 Hz. The AM detection thresholds were obtained for modulation frequencies of 10, 20, 30, 50, 70, and 100 Hz. The intrinsic envelope power of the carrier at the output of the modulation filter tuned to the signal modulation frequency appears to provide a good estimate for AM detection threshold. The results are compared with predictions on the basis of the more complex auditory processing model by Dau et al.


Journal of the Acoustical Society of America | 1986

Phase effects in masking related to dispersion in the inner ear

Bennett K. Smith; Ulrich K. Sieben; Ag Armin Kohlrausch; Manfred R. Schroeder

Phase effects in masking experiments using multitone maskers are usually associated with strong variations in the masker envelope. In this article, psychoacoustic experiments with such maskers that lead to phase-dependent threshold variations of up to 20 dB, although the phase transformation leaves the envelope unchanged, are described. However, after filtering the maskers with a realistic basilar membrane model, the envelopes are different owing to the model's phase-dispersive properties. Comparison of model outputs with the experimental results reveals a strong correlation between the two for a wide range of parameters, provided one makes the additional assumption that the ear has a minimum integration time of a few milliseconds.


International Conference on Acoustics, Speech, and Signal Processing | 2002

A new psychoacoustical masking model for audio coding applications

Steven van de Par; Ag Armin Kohlrausch; Ghassan Charestan; Richard Heusdens

The use of psychoacoustical masking models for audio coding applications has been widespread over the past decades. In such applications, it is typically assumed that the original input signal serves as a masker for the distortions that are introduced by the lossy coding method that is used. Such masking models are based on the peripheral bandpass filtering properties of the auditory system and basically evaluate the distortion-to-masker ratio within each auditory filter. Up to now these models have been based on the assumption that the masking of distortions is governed by the auditory filter for which the ratio between distortion and masker is largest. This assumption, however, is not in line with some new findings within the field of psychoacoustics. A more accurate assumption would be that the human auditory system is able to integrate distortions that are present within a range of auditory filters. In this contribution a new model is presented which is in line with new psychoacoustical studies and which is suitable for application within an audio codec. Although this model can be used to derive a masking curve, the model also gives a measure for the detectability of distortions provided that distortions are not too large.
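
The contrast between the two assumptions in this abstract can be sketched numerically. The code below compares a "worst filter" rule, which looks only at the largest distortion-to-masker ratio, with a rule that sums the ratios across filters. It is an illustrative simplification, not the published model: the per-band powers, the `internal_noise` constant, and the detection criterion of 1 are all hypothetical.

```python
def band_ratios(distortion, masker, internal_noise=1e-9):
    """Distortion-to-masker power ratio in each auditory filter."""
    return [d / (m + internal_noise) for d, m in zip(distortion, masker)]

def worst_band_ratio(distortion, masker):
    """Classical assumption: only the filter with the largest ratio matters."""
    return max(band_ratios(distortion, masker))

def integrated_detectability(distortion, masker):
    """Spectral-integration assumption: ratios add up across filters."""
    return sum(band_ratios(distortion, masker))

# Hypothetical per-filter powers: moderate distortion spread over four filters.
masker_power = [1.0, 1.0, 1.0, 1.0]
distortion_power = [0.4, 0.4, 0.4, 0.4]

worst = worst_band_ratio(distortion_power, masker_power)
integrated = integrated_detectability(distortion_power, masker_power)
```

With a criterion of 1, the worst-filter rule calls this distortion inaudible (about 0.4) while the integrated measure calls it audible (about 1.6). A coder that relies on the worst-filter rule alone could therefore underestimate the audibility of distortion spread across several filters.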

Collaboration


Dive into Ag Armin Kohlrausch's collaborations.

Top Co-Authors

Torsten Dau

Technical University of Denmark

Dj Dik Hermes

Eindhoven University of Technology

James F. Juola

Eindhoven University of Technology
