Ralf Geiger
Fraunhofer Society
Publication
Featured research published by Ralf Geiger.
international conference on acoustics, speech, and signal processing | 2009
Max Neuendorf; Philippe Gournay; Markus Multrus; Jérémie Lecomte; Bruno Bessette; Ralf Geiger; Stefan Bayer; Guillaume Fuchs; Johannes Hilpert; Redwan Salami; Gerald Schuller; Roch Lefebvre; Bernhard Grill
Traditionally, speech coding and audio coding were separate worlds. Based on different technical approaches and different assumptions about the source signal, neither of the two coding schemes could efficiently represent both speech and music at low bitrates. This paper presents a unified speech and audio codec, which efficiently combines techniques from both worlds. This results in a codec that exhibits consistently high quality for speech, music and mixed audio content. The paper gives an overview of the codec architecture and presents results of formal listening tests comparing this new codec with HE-AAC(v2) and AMR-WB+. This new codec forms the basis of the reference model in the ongoing MPEG standardization activity for Unified Speech and Audio Coding.
international conference on acoustics, speech, and signal processing | 2002
Ralf Geiger; Jürgen Herre; Jürgen Koller; Karlheinz Brandenburg
The Modified Discrete Cosine Transform (MDCT) is widely used in modern perceptual audio coding schemes. In this paper we present an integer approximation of this lapped transform, called IntMDCT, which is derived from the MDCT using the lifting scheme. This reversible integer transform inherits most of the attractive properties of the MDCT, exhibiting a good spectral representation of the audio signal, critical sampling and overlapping of blocks. This makes the IntMDCT well suited both for lossless audio coding and for combined perceptual and lossless audio coding. A scalable system is presented providing a lossless enhancement of perceptual audio coding schemes, such as MPEG-2 AAC.
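The lifting scheme mentioned in this abstract can be illustrated on a single plane rotation, the kind of building block a transform such as the MDCT can be factored into. The sketch below is a generic textbook construction, not code from the paper: the function names and the rounding rule are assumptions chosen for illustration. A rotation by theta is split into three lifting steps, each of which adds a rounded value to one branch, so every step can be undone exactly.

```python
import math

def _round(v):
    # Round to nearest integer (ties go up); any deterministic rule would do.
    return math.floor(v + 0.5)

def int_rotate(x, y, theta):
    """Integer approximation of a rotation by theta using three lifting steps.

    Assumes sin(theta) != 0. Names and rounding rule are illustrative.
    """
    c = (math.cos(theta) - 1.0) / math.sin(theta)
    s = math.sin(theta)
    x = x + _round(c * y)   # lifting step 1
    y = y + _round(s * x)   # lifting step 2
    x = x + _round(c * y)   # lifting step 3
    return x, y

def int_rotate_inverse(x, y, theta):
    """Undo int_rotate exactly by subtracting the same rounded values in reverse order."""
    c = (math.cos(theta) - 1.0) / math.sin(theta)
    s = math.sin(theta)
    x = x - _round(c * y)
    y = y - _round(s * x)
    x = x - _round(c * y)
    return x, y
```

Because each step changes only one branch by a value computed from the other, unchanged branch, the inverse recovers the input bit-exactly, while the output stays close to the floating-point rotation.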
international conference on acoustics, speech, and signal processing | 2003
Ralf Geiger; A. Herre; Gerald Schuller; Thomas Sporer
This paper presents an embedded fine-grain scalable perceptual and lossless audio coding scheme. The enabling technology for this combined perceptual and lossless audio coding approach is the integer modified discrete cosine transform (IntMDCT), which is an integer approximation of the MDCT based on the lifting scheme. It maintains the perfect reconstruction property and therefore enables efficient lossless coding in the frequency domain. The close approximation of the MDCT also allows us to build a perceptual coding scheme based on the IntMDCT. In this paper a bit-sliced arithmetic coding technique is applied to the IntMDCT values. Together with the encoded shape of the masking threshold, a perceptually hierarchical bitstream is obtained, containing several stages of perceptual quality and extending to lossless operation when transmitted completely. A concept of encoding subslices is presented in order to obtain a fine adaptation to the masking threshold, especially in the range of perceptually transparent quality.
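The idea of a bit-sliced, embedded representation can be sketched in a few lines. This is a simplified illustration under stated assumptions: it slices sign-magnitude integers into bit planes from most to least significant, and it omits the arithmetic coding, the masking threshold, and the subslices described in the abstract. The function names are hypothetical.

```python
def split_bitplanes(coeffs, num_planes):
    """Split integers into signs and bit planes; planes[0] is the most significant.

    Assumes every magnitude fits in num_planes bits.
    """
    signs = [v < 0 for v in coeffs]
    mags = [abs(v) for v in coeffs]
    planes = [[(m >> b) & 1 for m in mags] for b in range(num_planes - 1, -1, -1)]
    return signs, planes

def merge_bitplanes(signs, planes, num_planes):
    """Rebuild coefficients from the first len(planes) of num_planes bit planes."""
    mags = [0] * len(signs)
    for i, plane in enumerate(planes):
        shift = num_planes - 1 - i
        mags = [m | (bit << shift) for m, bit in zip(mags, plane)]
    return [-m if neg else m for m, neg in zip(mags, signs)]

signs, planes = split_bitplanes([13, -7, 0, 2, -30], 5)
coarse = merge_bitplanes(signs, planes[:3], 5)   # [12, -4, 0, 0, -28]
exact = merge_bitplanes(signs, planes, 5)        # [13, -7, 0, 2, -30]
```

Decoding only the leading planes gives a coarser approximation, and decoding all planes restores the integers exactly, which mirrors how a truncated embedded bitstream degrades gracefully while the complete one is lossless.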
asilomar conference on signals, systems and computers | 2003
Ralf Geiger; Yoshikazu Yokotani; Gerald Schuller
Lifting-scheme-based integer transforms are very powerful tools for constructing lossless coding schemes. These transforms, such as the integer fast Fourier transform (IntFFT) and the integer modified discrete cosine transform (IntMDCT), are integer approximations of the original floating-point transforms, and hence there is an approximation error in the transform domain. This paper proposes structures for improved integer transforms in terms of approximation accuracy and computational efficiency. Experimental results show that clear improvements in both respects are achieved in lossless audio coding.
Journal of the Acoustical Society of America | 2008
Ralf Geiger; Thomas Sporer; Karlheinz Brandenburg; Juergen Herre; Juergen Koller; Joachim Deguara
A time-discrete audio signal is processed to provide a quantization block with quantized spectral values. Furthermore, an integer spectral representation is generated from the time-discrete audio signal using an integer transform algorithm. The quantization block, which has been generated using a psychoacoustic model, is inversely quantized and rounded, and a difference is then formed between the integer spectral values and the inversely quantized, rounded spectral values. After decoding, the quantization block alone provides a lossy, psychoacoustically coded and decoded audio signal, whereas the quantization block together with the combination block provides a lossless or almost lossless coded and decoded audio signal. By generating the differential signal in the frequency domain, a simpler coder/decoder structure results.
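The two-layer structure described here, a lossy layer plus a frequency-domain difference, can be sketched as follows. This is a minimal illustration under assumptions: a single uniform step size stands in for the psychoacoustically controlled quantizer, the input is taken to be an already computed integer spectrum, and the function names are hypothetical.

```python
def encode(int_spectrum, step):
    """Return the lossy layer and the integer difference signal.

    A single uniform step size is an assumption made for brevity.
    """
    quantized = [round(v / step) for v in int_spectrum]        # lossy layer
    dequantized = [round(q * step) for q in quantized]         # inverse quantize, then round
    difference = [v - d for v, d in zip(int_spectrum, dequantized)]
    return quantized, difference

def decode(quantized, step, difference=None):
    """Decode the lossy layer alone, or add the difference for exact reconstruction."""
    approx = [round(q * step) for q in quantized]
    if difference is None:
        return approx
    return [a + d for a, d in zip(approx, difference)]
```

The decoder repeats the same inverse quantization and rounding as the encoder, so adding the difference restores the integer spectrum exactly, while a decoder that ignores the difference still obtains the lossy approximation.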
asilomar conference on signals, systems and computers | 2002
Ralf Geiger; Gerald Schuller
Recently, lifting-based integer approximations of filter banks have received much attention, especially in the field of image coding. This paper focuses on the application of these techniques to cosine-modulated filter banks for audio coding, including not only the modified discrete cosine transform (MDCT) but also low-delay filter banks. Applications of the integer filter banks include lossless audio coding and backward-compatible lossless enhancement of MDCT-based perceptual audio coding schemes, such as MPEG-2/4 AAC.
international conference on acoustics, speech, and signal processing | 2006
Ralf Geiger; Yoshikazu Yokotani; Gerald Schuller
This paper describes high data-rate audio data hiding using the IntMDCT. The IntMDCT is an integer approximation of the MDCT with perfect reconstruction. Based on this transform, we describe a straightforward way to embed data and extract it in a bit-exact manner while perceptual transparency is maintained. Since the IntMDCT spectrum can be used to obtain a closer approximation of the masking threshold compared to time domain approaches, it is possible to achieve higher data rates. In a simple experimental implementation, we found we could embed data at rates up to about 140 kb/s without introducing audible distortions.
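One simple way bit-exact embedding in integer coefficients could work is to overwrite the least significant bits of each coefficient. This is an assumption for illustration, not the method stated in the paper. In the sketch below, the hypothetical `capacity` list gives the number of bits each coefficient may carry; in a real system it would have to be derived from the masking threshold in a way the extractor can reproduce.

```python
def embed(coeffs, capacity, bits):
    """Overwrite the lowest capacity[i] bits of coeffs[i] with payload bits.

    Payload bits beyond the end of `bits` are padded with zeros.
    """
    out, pos = [], 0
    for v, k in zip(coeffs, capacity):
        payload = 0
        for _ in range(k):
            bit = bits[pos] if pos < len(bits) else 0
            payload = (payload << 1) | bit
            pos += 1
        # Clear the k low bits, then insert the payload (also valid for negative ints).
        out.append((v & ~((1 << k) - 1)) | payload)
    return out

def extract(coeffs, capacity):
    """Read back the embedded bits in the order they were written."""
    bits = []
    for v, k in zip(coeffs, capacity):
        payload = v & ((1 << k) - 1)
        bits.extend((payload >> s) & 1 for s in range(k - 1, -1, -1))
    return bits
```

Since the transform is perfectly reconstructing on integers, coefficients modified this way survive the inverse and forward transform unchanged, which is what would make bit-exact extraction possible in such a scheme.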
IEEE Transactions on Audio, Speech, and Language Processing | 2006
Yoshikazu Yokotani; Ralf Geiger; Gerald Schuller; Soontorn Oraintara; K. R. Rao
In this paper, lossless audio coding using the integer modified discrete cosine transform (IntMDCT) is discussed. The IntMDCT is constructed as an integer approximation of the MDCT using the lifting scheme and is reversible. The rounding error shape of the IntMDCT is derived. When the spectral energy of the input audio signal is concentrated at the low frequencies, the rounding error spectrum limits the lossless coding performance. A method for shaping the rounding error in the transform domain is presented. This rounding error shaping scheme manipulates the error so that it is below the spectral envelope of the signal at the high frequencies in order to improve the lossless coding performance for the signal. Examples of an error shaping filter design are presented and verified by simulations. An IntMDCT-based lossless coding implementation is carried out to illustrate the use of the error shaping filters.
international conference on digital signal processing | 2004
Yoshikazu Yokotani; Ralf Geiger; Gerald Schuller; Soontorn Oraintara; K. R. Rao
This paper discusses approximation noise shaping to improve the efficiency of the integer modified discrete cosine transform (IntMDCT)-based lossless audio codec. The scheme is applied to rounding operations associated with lifting steps to shape the noise spectrum towards the low frequency bands. In this paper, constraints on the noise shaping filter and a design procedure with the constraints are discussed. Several noise shaping filters are designed and experimental results showing the improvement are presented.
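Noise shaping inside a lifting step can be sketched with first-order error feedback. The filter used here, with noise transfer function 1 + z^-1, is an assumption chosen because it has a zero at the highest frequency and so moves rounding noise towards low frequencies; it is not one of the filters designed in the paper, and the function names are hypothetical.

```python
import math

def shaped_round(values):
    """Round a sequence to integers, feeding each rounding error into the next sample."""
    out, err = [], 0.0
    for u in values:
        v = u + err                  # add the previous rounding error
        q = math.floor(v + 0.5)      # round to nearest integer
        err = q - v                  # error made on this sample
        out.append(q)
    return out                       # total error per sample: err[n] + err[n-1]

def lifting_step(x, y, c):
    """y_out[n] = y[n] + Q(c * x[n]), with Q the noise-shaped rounding above."""
    return [b + r for b, r in zip(y, shaped_round([c * a for a in x]))]

def lifting_step_inverse(x, y_out, c):
    """Exact inverse: x is unchanged, so the same rounded sequence can be recomputed."""
    return [b - r for b, r in zip(y_out, shaped_round([c * a for a in x]))]
```

The step stays reversible because the shaped rounding depends only on the untouched branch `x`, so the decoder regenerates the identical integer sequence and subtracts it.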
workshop on applications of signal processing to audio and acoustics | 2007
Markus Schnell; Ralf Geiger; Markus Schmidt; Markus Multrus; Michael Mellar; Jürgen Herre; Gerald Schuller
Low delay perceptual audio coding has recently gained wide acceptance for high quality communication. While common schemes are based on the well-known Modified Discrete Cosine Transform (MDCT) filterbank, this paper describes novel coding algorithms that, for the first time, make use of dedicated low delay filterbanks, thus achieving improved coding efficiency while maintaining or even reducing the low codec delay. The MPEG-4 Enhanced Low Delay AAC (AAC-ELD) coder currently under development within ISO/MPEG combines a traditional perceptual audio coding scheme with spectral band replication (SBR), both running in a delay-optimized fashion by using low delay filterbanks.