Publication


Featured research published by Kari Jarvinen.


IEEE Transactions on Speech and Audio Processing | 2002

The adaptive multirate wideband speech codec (AMR-WB)

Bruno Bessette; Redwan Salami; Roch Lefebvre; Milan Jelinek; Jani Rotola-Pukkila; Janne Vainio; Hannu Mikkola; Kari Jarvinen

This paper describes the adaptive multirate wideband (AMR-WB) speech codec selected by the Third Generation Partnership Project (3GPP) for GSM and the third generation mobile communication WCDMA system for providing wideband speech services. The AMR-WB speech codec algorithm was selected in December 2000 and the corresponding specifications were approved in March 2001. The AMR-WB codec was also selected by the International Telecommunication Union-Telecommunication Sector (ITU-T) in July 2001 in the standardization activity for wideband speech coding around 16 kb/s and was approved in January 2002 as Recommendation G.722.2. The adoption of AMR-WB by ITU-T is of significant importance since for the first time the same codec is adopted for wireless as well as wireline services. AMR-WB uses an extended audio bandwidth from 50 Hz to 7 kHz and gives superior speech quality and voice naturalness compared to existing second- and third-generation mobile communication systems. The wideband speech service provided by the AMR-WB codec will give mobile communication speech quality that also substantially exceeds (narrowband) wireline quality. The paper details AMR-WB standardization history, algorithmic description including novel techniques for efficient ACELP wideband speech coding and subjective quality performance of the codec.


International Conference on Acoustics, Speech, and Signal Processing | 1997

GSM enhanced full rate speech codec

Kari Jarvinen; Janne Vainio; Pekka Kapanen; Tero Honkanen; Petri Haavisto; Redwan Salami; Claude Laflamme; Jean-Pierre Adoul

This paper describes the GSM enhanced full rate (EFR) speech codec that has been standardised for the GSM mobile communication system. The GSM EFR codec has been jointly developed by Nokia and University of Sherbrooke. It provides speech quality at least equivalent to that of a wireline telephony reference (32 kbit/s ADPCM). The EFR codec uses 12.2 kbit/s for speech coding and 10.6 kbit/s for error protection. Speech coding is based on the ACELP algorithm (algebraic code excited linear prediction). The codec provides substantial quality improvement compared to the existing GSM full rate and half rate codecs. The old GSM codecs lack wireline quality even in error-free channel conditions, while the EFR codec provides wireline quality not only for error-free conditions but also for the most typical error conditions. With the EFR codec, wireline quality is also sustained in the presence of background noise and in tandem connections (mobile to mobile calls).
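The bit-rate split quoted in this abstract can be checked with simple arithmetic. The sketch below assumes that the two figures together fill the GSM full rate channel, whose gross rate of 22.8 kbit/s is taken from the multi-rate codec abstract further down this page. The function name is illustrative only.

```python
def gross_bit_rate(source_kbps, protection_kbps):
    """Return the total channel bit rate in kbit/s, rounded to one decimal."""
    return round(source_kbps + protection_kbps, 1)


# EFR figures from the abstract: speech coding plus error protection.
efr_gross = gross_bit_rate(12.2, 10.6)
print(efr_gross)
```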


Journal of the Acoustical Society of America | 1995

Pulse pattern excited linear prediction voice coder

Jari Haggvist; Kari Jarvinen; Kari-Pekka Estola; Jukka Ranta

Speech coding of the code excited linear predictive type is implemented by providing an excitation vector which comprises a set of a pre-determined number of pulse patterns from a codebook of P pulse patterns, which have a selected orientation and a pre-determined delay with respect to the starting point of the excitation vector. This requires modest computational power and a small memory space, which allows it to be implemented in one signal processor.
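The way an excitation vector is assembled from pulse patterns might be sketched as follows. This is only an illustration under stated assumptions: "orientation" is read here as a sign of +1 or -1, patterns that run past the end of the vector are truncated, and the codebook contents and the name `build_excitation` are hypothetical rather than taken from the publication.

```python
def build_excitation(length, codebook, selections):
    """Sum the selected pulse patterns into one excitation vector.

    selections is a list of (pattern_index, orientation, delay) tuples,
    where orientation is assumed to be +1 or -1 and delay is the offset
    from the starting point of the excitation vector.
    """
    excitation = [0.0] * length
    for index, orientation, delay in selections:
        for offset, value in enumerate(codebook[index]):
            position = delay + offset
            if position < length:  # drop samples that fall past the end
                excitation[position] += orientation * value
    return excitation


# Hypothetical codebook with P = 2 short pulse patterns.
codebook = [[1.0, 0.5], [1.0, -1.0, 0.25]]
excitation = build_excitation(8, codebook, [(0, +1, 0), (1, -1, 3)])
print(excitation)
```

Because each pattern is only a few samples long, the construction needs a handful of additions per selected pattern, which is in line with the modest computational load the abstract mentions.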


Computer Communications | 2010

Media coding for the next generation mobile system LTE

Kari Jarvinen; Imed Bouazizi; Lasse Laaksonen; Pasi Ojala; Anssi Rämö

Introduction of LTE (Long Term Evolution) brings enhanced quality for 3GPP multimedia services. The high throughput and low latency of LTE enable higher quality media coding than what is possible in UMTS. LTE-specific codecs have not yet been defined but work on them is ongoing in 3GPP. The LTE codecs are expected to improve the basic signal quality, but also to offer new capabilities such as extended audio bandwidth, stereo and multi-channels for voice and higher temporal and spatial resolutions for video. Due to the wide range of functionalities in media coding, LTE gives more flexibility for service provision to cope with heterogeneous terminal capabilities and transmission over heterogeneous network conditions. By adjusting the bit-rate, the computational complexity, and the spatial and temporal resolution of audio and video, transport and rendering can be optimised throughout the media path hence guaranteeing the best possible quality of service.


Journal of the Acoustical Society of America | 2000

Speech synthesizer employing post-processing for enhancing the quality of the synthesized speech

Kari Jarvinen; Tero Honkanen

A post-processor 317 and a method for enhancing synthesised speech are disclosed. The post-processor 317 operates on a signal ex(n) derived from an excitation generator 211 typically comprising a fixed code book 203 and an adaptive code book 204, the signal ex(n) being formed from the addition of scaled outputs from the fixed code book 203 and adaptive code book 204. The post-processor operates on ex(n) by adding to it a scaled signal pv(n) derived from the adaptive code book 204. A gain or scale factor p is determined by the speech coefficients input to the excitation generator 211. The combined signal ex(n)+pv(n) is normalised by unit 316 and input to an LPC or speech synthesis filter 208, prior to being input to an audio processing unit 209.
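The signal flow described above, ex(n) plus a scaled adaptive code book contribution followed by normalisation, could look roughly like this. The sketch makes two assumptions not stated in the abstract: normalisation is taken to mean rescaling the combined signal to the energy of ex(n), and the scale factor p is passed in directly instead of being derived from the speech coefficients. The function name `post_process` is hypothetical.

```python
import math


def post_process(ex, v, p):
    """Return ex(n) + p*v(n), rescaled to the energy of ex(n).

    ex: total excitation, v: adaptive code book contribution,
    p: scale factor (assumed to be supplied by the caller).
    """
    combined = [e + p * a for e, a in zip(ex, v)]
    energy_in = sum(e * e for e in ex)
    energy_out = sum(c * c for c in combined)
    if energy_out == 0.0:
        return combined  # nothing to normalise
    scale = math.sqrt(energy_in / energy_out)
    return [scale * c for c in combined]


enhanced = post_process([1.0, 0.0, -1.0], [0.5, 0.5, 0.0], 0.5)
print(enhanced)
```

The output of this step would then be fed to the synthesis filter in place of the unmodified excitation.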


Journal of the Acoustical Society of America | 1997

Methods and apparatus for coding a speech signal using variable order filtering

Kari Jarvinen; Olli Ali-Yrkko

The method concerns digital coding of a speech signal. The method is based on the use of a model of speech production comprising an excitation and shaping of the excitation in a filtering operation in such a manner that the order of the filtering which models the shaping of the excitation signal occurring in the vocal tract is adapted according to the speech signal to be coded. By means of the method it is possible to achieve a total modelling for the speech signal--and thus efficient speech coding--which is better than methods using fixed-order, model-based filtering of the speech tract. From the standpoint of the efficiency of the coding, by decreasing a needlessly large order of the filtering method, the bit rate to be used for coding the excitation signal can be increased or the bit rate resources thus freed up can be allocated for use in the error correction coding. On the other hand, the order of the filtering operation modelling the vocal tract can if necessary be increased if this is of essential benefit in the coding, and correspondingly, the bit rate to be used in coding the excitation signal can be lowered.
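One way to adapt the filter order to the signal is to run a Levinson-Durbin recursion and keep only the orders that still pay off. The sketch below is not the method of the publication, which does not state its selection rule in this abstract. It assumes a simple criterion: keep the largest order whose added coefficient reduces the prediction error by at least `min_gain` of the frame energy. All names and the threshold are illustrative.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation of one frame for lags 0..max_lag."""
    return [sum(frame[n] * frame[n - lag] for n in range(lag, len(frame)))
            for lag in range(max_lag + 1)]


def choose_order(frame, max_order, min_gain=0.05):
    """Return (order, coefficients) chosen by an assumed gain threshold."""
    r = autocorrelation(frame, max_order)
    if r[0] == 0.0:
        return 0, []
    a = []          # a[i] multiplies s(n - 1 - i) in the predictor
    error = r[0]    # prediction error energy at the current order
    best_order, best_a = 0, []
    for m in range(1, max_order + 1):
        acc = r[m] - sum(a[i] * r[m - 1 - i] for i in range(len(a)))
        k = acc / error  # reflection coefficient for order m
        a = [a[i] - k * a[m - 2 - i] for i in range(m - 1)] + [k]
        new_error = error * (1.0 - k * k)
        if (error - new_error) / r[0] >= min_gain:
            best_order, best_a = m, list(a)
        error = new_error
        if error <= 0.0:
            break
    return best_order, best_a


# A decaying exponential is well modelled by a first-order predictor.
frame = [0.9 ** n for n in range(50)]
order, coeffs = choose_order(frame, max_order=4)
print(order, coeffs)
```

Bits saved by a lower order could then be given to the excitation or to error correction coding, as the abstract describes.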


Journal of the Acoustical Society of America | 1997

Digital coding of speech signals using analysis filtering and synthesis filtering

Kari Jarvinen

A digital speech encoder is constructed to include a short term analyzer for forming a set of prediction parameters a(i), corresponding to an input speech signal, and an encoder for producing an excitation signal. The encoder includes a plurality of serially coupled coding blocks, wherein each coding block includes an analysis filter, a sample selection block, and a synthesis filter. The analysis filter outputs speech signal sample values to the sample selection block, which selects and outputs Ki sample values representing a selected partial excitation signal. The synthesis filter synthesizes a speech signal corresponding to the selected partial excitation signal output by the selection block and outputs a partial excitation synthesis result to an output of the coding block. At the output of each coding block is a subtractor arranged for subtracting a partial excitation synthesis result that is output from the coding block from the speech signal to obtain a difference signal. The difference signal is coupled to the input of an analysis filter of a next serially coupled coding block. A quantizer is also provided for forming the excitation signal in accordance with all of the partial excitation signals generated by the coding blocks.
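The chain of coding blocks can be illustrated with a small sketch. Several details are assumptions: the predictor sign convention, the selection rule (here the Ki largest-magnitude samples), and the omission of the quantizer. Function names are hypothetical and the code is not taken from the publication.

```python
def analysis_filter(signal, a):
    """r(n) = s(n) - sum_i a[i] * s(n - 1 - i)."""
    out = []
    for n in range(len(signal)):
        pred = sum(a[i] * signal[n - 1 - i]
                   for i in range(len(a)) if n - 1 - i >= 0)
        out.append(signal[n] - pred)
    return out


def synthesis_filter(excitation, a):
    """y(n) = x(n) + sum_i a[i] * y(n - 1 - i), the inverse of the above."""
    out = []
    for n in range(len(excitation)):
        pred = sum(a[i] * out[n - 1 - i]
                   for i in range(len(a)) if n - 1 - i >= 0)
        out.append(excitation[n] + pred)
    return out


def select_samples(residual, k):
    """Keep the k largest-magnitude samples and zero the rest (assumed rule)."""
    order = sorted(range(len(residual)),
                   key=lambda n: abs(residual[n]), reverse=True)
    keep = set(order[:k])
    return [residual[n] if n in keep else 0.0 for n in range(len(residual))]


def encode(speech, a, samples_per_block):
    """Run the serial coding blocks; return partial excitations and remainder."""
    target = list(speech)
    partial_excitations = []
    for k in samples_per_block:
        partial = select_samples(analysis_filter(target, a), k)
        synthesized = synthesis_filter(partial, a)
        # Subtractor: the difference signal feeds the next coding block.
        target = [t - s for t, s in zip(target, synthesized)]
        partial_excitations.append(partial)
    return partial_excitations, target


speech = [1.0, 0.5, -0.25, 0.75, -1.0, 0.2]
parts, remainder = encode(speech, [0.5, -0.2], [2, 2])
print(parts, remainder)
```

Each block removes the part of the signal it has modelled, so later blocks only work on what earlier ones left behind.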


International Conference on Acoustics, Speech, and Signal Processing | 1998

GSM EFR based multi-rate codec family

Janne Vainio; Hannu Mikkola; Kari Jarvinen; Petri Haavisto

This paper describes a multi-rate codec family developed as a potential candidate for the GSM adaptive multi-rate (AMR) codec standard. The codec family consists of the GSM enhanced full rate (EFR) codec and lower bit-rate extensions thereof. The codec family consists of several codecs, i.e., modes that have different bit-rate partitionings between source coding and error protection. All the source codecs use the same ACELP-method (algebraic code excited linear predictive coding) used also in the GSM EFR codec. The codec operates at gross bit-rates of 22.8 kbit/s in the GSM full rate (FR) channel and 11.4 kbit/s in the GSM half rate (HR) channel. In the full rate channel, the codec provides improved error robustness over the GSM enhanced full rate (EFR) codec. It extends wireline quality (equal to or better than G.726-32 ADPCM) to poor channel error conditions with low C/I-ratios of 7 dB or even below. When operated in the half rate channel, the codec provides improved channel capacity while still providing wireline quality at high C/I-ratios above 16-19 dB.


Journal of the Acoustical Society of America | 1996

Method and apparatus for decoding LPC-encoded speech using a median filter modification of LPC filter factors to compensate for transmission errors

Pekka Kapanen; Kari Jarvinen

Disclosed herein are methods and apparatus for improving the quality of synthesized speech that is transmitted through a channel that is susceptible to transmission errors. In a presently preferred embodiment of the invention a speech signal is assumed to be first encoded using a Linear Predictive Coding (LPC) technique prior to transmission. The parameters that describe the short-term spectral behavior of the speech signal are received and then applied to and processed by a non-linear median processing block only on an occurrence of a predetermined number of transmission errors in the received LPC speech signal. The median-processed short term speech parameters are subsequently employed, together with a received excitation signal, in a synthesis filter to synthesize a speech signal of improved quality over what would be obtained if the short term speech parameters were not median processed to compensate for the transmission errors.
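The conditional median processing of the short-term parameters might be sketched as below. The abstract does not say over which values the median is taken or what the error threshold is, so the sketch assumes a per-coefficient median across the current and a few previous frames, and a caller-supplied threshold. The name `conceal_lpc` and all numbers are hypothetical.

```python
from statistics import median


def conceal_lpc(history, received, error_count, error_threshold=1):
    """Return the short-term parameter vector to use for synthesis.

    history: previously used parameter vectors (assumed window),
    received: parameter vector of the current frame,
    error_count: number of transmission errors detected in this frame.
    """
    if error_count < error_threshold:
        return list(received)  # no median processing without errors
    frames = history + [received]
    # Per-coefficient median across the window suppresses outliers.
    return [median(values) for values in zip(*frames)]


history = [[0.9, -0.3], [0.8, -0.2]]
corrupted = [5.0, -0.25]
print(conceal_lpc(history, corrupted, error_count=0))
print(conceal_lpc(history, corrupted, error_count=2))
```

The returned parameters would then drive the synthesis filter together with the received excitation signal.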


International Conference on Acoustics, Speech, and Signal Processing | 2015

Standardization of the new 3GPP EVS codec

Stefan Bruhn; Harald Pobloth; M. Schnell; B. Grill; Jon Gibbs; Lei Miao; Kari Jarvinen; Lasse Laaksonen; Noboru Harada; Nobuhiko Naka; Stephane Ragot; Stéphane Proust; T. Sanda; Imre Varga; C. Greer; Milan Jelinek; M. Xie; Paolo Usai

A new codec for Enhanced Voice Services (EVS), the successor of the current mobile HD voice codec AMR-WB, was standardized by the 3rd Generation Partnership Project (3GPP) in September 2014. The EVS codec addresses 3GPP's needs for cutting-edge technology enabling operation of 3GPP mobile communication systems in the most competitive means in terms of communication quality and efficiency. This paper provides an in-depth insight into 3GPP's rigorous and transparent processes that made it possible for the mobile industry, with its many competing players, to successfully develop and standardize a codec in an open, fair and constructive process. This paper also enables an understanding of this achievement by providing an overview of the EVS codec technology, the standard specifications, and the performance of the codec that will elevate HD voice services to the next quality level.
