Pekka Kapanen
Nokia
Publication
Featured research published by Pekka Kapanen.
international conference on acoustics, speech, and signal processing | 1997
Kari Jarvinen; Janne Vainio; Pekka Kapanen; Tero Honkanen; Petri Haavisto; Redwan Salami; Claude Laflamme; Jean-Pierre Adoul
This paper describes the GSM enhanced full rate (EFR) speech codec that has been standardised for the GSM mobile communication system. The GSM EFR codec has been jointly developed by Nokia and University of Sherbrooke. It provides speech quality at least equivalent to that of a wireline telephony reference (32 kbit/s ADPCM). The EFR codec uses 12.2 kbit/s for speech coding and 10.6 kbit/s for error protection. Speech coding is based on the ACELP algorithm (algebraic code excited linear prediction). The codec provides substantial quality improvement compared to the existing GSM full rate and half rate codecs. The old GSM codecs lack wireline quality even in error-free channel conditions, while the EFR codec provides wireline quality not only for error-free conditions but also for the most typical error conditions. With the EFR codec, wireline quality is also sustained in the presence of background noise and in tandem connections (mobile to mobile calls).
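The two bit rates quoted in the abstract can be turned into a per-frame bit budget with a little arithmetic. The sketch below is illustrative only: the 20 ms frame length is an assumption that the abstract does not state, and the names `SPEECH_BPS`, `PROTECTION_BPS`, `FRAME_MS` and `bits_per_frame` are hypothetical.

```python
# Per-frame bit budget for the rates quoted in the abstract.
# Integer bit/s values are used to avoid floating-point rounding.
SPEECH_BPS = 12_200      # 12.2 kbit/s for speech coding (from the abstract)
PROTECTION_BPS = 10_600  # 10.6 kbit/s for error protection (from the abstract)
FRAME_MS = 20            # ASSUMPTION: frame length, not stated in the abstract


def bits_per_frame(rate_bps: int, frame_ms: int = FRAME_MS) -> int:
    """Number of bits carried in one frame at the given bit rate."""
    return rate_bps * frame_ms // 1000


gross_bps = SPEECH_BPS + PROTECTION_BPS            # total channel rate
speech_bits = bits_per_frame(SPEECH_BPS)           # bits of coded speech per frame
protection_bits = bits_per_frame(PROTECTION_BPS)   # bits of error protection per frame
gross_bits = bits_per_frame(gross_bps)             # total bits per frame
```

Under the assumed frame length, nearly half of every transmitted frame is spent on error protection rather than on speech, which is consistent with the abstract's emphasis on quality under typical error conditions.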
Journal of the Acoustical Society of America | 1996
Pekka Kapanen; Kari Jarvinen
Disclosed herein are methods and apparatus for improving the quality of synthesized speech that is transmitted through a channel that is susceptible to transmission errors. In a presently preferred embodiment of the invention a speech signal is assumed to be first encoded using a Linear Predictive Coding (LPC) technique prior to transmission. The parameters that describe the short-term spectral behavior of the speech signal are received and then applied to and processed by a non-linear median processing block only on an occurrence of a predetermined number of transmission errors in the received LPC speech signal. The median-processed short term speech parameters are subsequently employed, together with a received excitation signal, in a synthesis filter to synthesize a speech signal of improved quality over what would be obtained if the short term speech parameters were not median processed to compensate for the transmission errors.
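The idea of median-processing the short-term parameters only when a frame is badly corrupted can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented method: the window length, the error threshold, the choice to apply the median component-wise, and the names `MedianConcealer` and `process` are all hypothetical, and the abstract does not say which parameter representation is used.

```python
from collections import deque
from statistics import median


class MedianConcealer:
    """Median-smooth short-term parameters only on badly corrupted frames."""

    def __init__(self, window: int = 3, error_threshold: int = 1):
        # ASSUMPTION: both values are illustrative, not taken from the source.
        self.history = deque(maxlen=window)     # most recent received vectors
        self.error_threshold = error_threshold  # the "predetermined number" of errors

    def process(self, params, error_count):
        """Return the parameter vector to hand to the synthesis filter."""
        self.history.append(list(params))
        if error_count < self.error_threshold:
            return list(params)  # frame considered clean: pass through unchanged
        # Corrupted frame: take the median of each parameter across the window,
        # so that a single outlying value is rejected.
        return [median(column) for column in zip(*self.history)]


concealer = MedianConcealer(window=3, error_threshold=1)
clean_1 = concealer.process([0.1, 0.2], error_count=0)
clean_2 = concealer.process([0.3, 0.4], error_count=0)
repaired = concealer.process([9.0, -9.0], error_count=2)
```

In this toy run the third vector is corrupted, so each of its components is replaced by the median of the last three received values, while the two clean frames pass through untouched.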
international conference on multimodal interfaces | 2002
Jan Kleindienst; Ladislav Seredi; Pekka Kapanen; Janne Bergman
This paper takes a closer look at the user interface issues in our research multi-modal browser architecture. The browser framework, also briefly introduced in this paper, reuses single-modal browser technologies available for VoiceXML, WML, and HTML browsing. User interface actions on a particular browser are captured, converted to events, and distributed to the other browsers participating (possibly on different hosts) in the multi-modal framework. We have defined a synchronization protocol, which distributes such events with the help of a central component called the Virtual Proxy. The choice of architecture and synchronization primitives has profound consequences for handling certain interesting UI use cases. We particularly address those specified by the W3C MultiModal Requirements, which concern the design of possible strategies for dealing with simultaneous input, resolving input inconsistencies, and defining synchronization points. The proposed approaches are illustrated by examples.
Universal Access in The Information Society | 2003
Jan Kleindienst; Ladislav Seredi; Pekka Kapanen; Janne Bergman
Universal-access multi-modal browsing is one of the emerging “killer” technologies that promise broader and more flexible access to information, faster task completion, and an advanced user experience. Drawing on the best of GUI and speech according to the circumstances, hardware capabilities, and environment, multi-modality gives application developers a scalable blend of input and output channels that may accommodate any user, device, and platform. This article describes a flexible multi-modal browser architecture, named Ferda the Ant, which reuses uni-modal browser technologies available for VoiceXML, WML, and HTML browsing. A central component, the Virtual Proxy, acts as a synchronization coordinator. This browser architecture can be implemented either in a single client configuration or by distributing the browser components across the network. We have defined and implemented a synchronization protocol to communicate the changes occurring in the context of a component browser to the other browsers participating in the multi-modal browser framework. Browser wrappers implement the required synchronization protocol functionality at each of the component browsers. The component browsers comply with existing content authoring standards, and we have designed a set of markup-level authoring conventions that facilitate maintaining the browser synchronization.
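The coordinator-and-wrapper arrangement described in the abstract can be pictured with a small in-process sketch. It is not the Ferda the Ant protocol: the event fields, the class names `UIEvent`, `VirtualProxy` and `BrowserWrapper`, and their methods are all hypothetical, and network distribution and conflict handling are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UIEvent:
    source: str      # name of the browser where the user action happened
    kind: str        # e.g. "input" (hypothetical event type)
    target: str      # identifier of the affected form field
    value: str = ""


class VirtualProxy:
    """Central coordinator: forwards each event to every other wrapper."""

    def __init__(self):
        self._wrappers = {}

    def register(self, wrapper):
        self._wrappers[wrapper.name] = wrapper
        wrapper.proxy = self

    def distribute(self, event):
        for name, wrapper in self._wrappers.items():
            if name != event.source:  # do not echo the event to its origin
                wrapper.apply(event)


class BrowserWrapper:
    """Adapts one component browser to the synchronization sketch."""

    def __init__(self, name):
        self.name = name
        self.proxy = None
        self.fields = {}  # stand-in for the component browser's form state

    def user_input(self, target, value):
        """Capture a local user action and publish it as an event."""
        self.fields[target] = value
        self.proxy.distribute(UIEvent(self.name, "input", target, value))

    def apply(self, event):
        """Apply a remote event locally without re-broadcasting it."""
        if event.kind == "input":
            self.fields[event.target] = event.value


proxy = VirtualProxy()
html = BrowserWrapper("html")
voice = BrowserWrapper("voicexml")
proxy.register(html)
proxy.register(voice)
voice.user_input("city", "Helsinki")  # spoken input is mirrored in the HTML view
```

Routing every event through one coordinator keeps the component browsers unaware of each other, which is what lets existing uni-modal browsers be reused behind thin wrappers.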
Archive | 1997
Kari Jarvinen; Pekka Kapanen; Vesa Ruoppila; Jani Rotola-Pukkila
Archive | 2003
Janne Bergman; Pekka Kapanen
Archive | 2001
Seppo Alanara; Pekka Kapanen
Archive | 1996
Pekka Kapanen
Archive | 1997
Kari Jarvinen; Pekka Kapanen; Jani Rotola-Pukkila; Vesa Ruoppila
Archive | 1999
Pekka Kapanen; Janne Vainio