Publication


Featured research published by Kalle Raiskila.


International Journal of Parallel Programming | 2015

pocl: A Performance-Portable OpenCL Implementation

Pekka Jääskeläinen; Carlos S. de La Lama; Kalle Raiskila; Jarmo Takala; Heikki Berg

OpenCL is a standard for parallel programming of heterogeneous systems. The benefits of a common programming standard are clear: multiple vendors can provide support for application descriptions written according to the standard, thus reducing the program porting effort. While the standard brings the obvious benefits of platform portability, the performance portability aspects are largely left to the programmer. The situation is made worse by multiple proprietary vendor implementations with different characteristics and, thus, different required optimization strategies. In this paper, we propose an OpenCL implementation that is both portable and performance portable. At its core is a kernel compiler that can be used to exploit the data parallelism of OpenCL programs on multiple platforms with different parallel hardware styles. The kernel compiler is modularized to perform target-independent parallel region formation separately from the target-specific parallel mapping of the regions, which enables support for various styles of fine-grained parallel resources such as subword SIMD extensions, SIMD datapaths and static multi-issue. Unlike previous similar techniques that work on the source level, the parallel region formation retains the information of the data parallelism using the LLVM IR and its metadata infrastructure. This data can be exploited by the later generic compiler passes for efficient parallelization. The proposed open source implementation of OpenCL is also platform portable, enabling OpenCL on a wide range of architectures, both those already commercialized and those still under research. The paper describes how the portability of the implementation is achieved. We test the two aspects of portability by using the kernel compiler and the OpenCL implementation to run OpenCL applications on various platforms with different styles of parallel resources. The results show that most of the benchmarked applications, when compiled using pocl, were faster than or close to as fast as the best proprietary OpenCL implementation for the platform at hand.
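To make the idea of parallel region formation more concrete, the following is a minimal sketch in plain C of how a work-group containing one barrier could be split into two regions, each wrapped in a loop over work-items. It is an illustration only and is not taken from the paper: the kernel, the function name `run_work_group`, and the constant `WG_SIZE` are hypothetical, and the loops here are ordinary serial loops standing in for whatever target-specific mapping (SIMD, multi-issue, and so on) a real compiler would choose.

```c
#include <stddef.h>

/* Hypothetical work-group size, chosen only for this illustration. */
#define WG_SIZE 8

/*
 * Conceptual OpenCL C kernel body being modelled (one work-group):
 *
 *   tmp[lid] = in[lid] * 2;
 *   barrier(CLK_LOCAL_MEM_FENCE);
 *   out[lid] = tmp[(lid + 1) % WG_SIZE];
 *
 * The barrier splits the body into two parallel regions. Within a
 * region the work-items are independent, so each region can be
 * executed as a loop over the local id.
 */
void run_work_group(const int *in, int *out)
{
    int tmp[WG_SIZE]; /* stands in for __local memory */

    /* Parallel region 1: all statements before the barrier. */
    for (size_t lid = 0; lid < WG_SIZE; ++lid) {
        tmp[lid] = in[lid] * 2;
    }

    /* The barrier is honoured by finishing region 1 for every
     * work-item before any work-item starts region 2. */

    /* Parallel region 2: all statements after the barrier. */
    for (size_t lid = 0; lid < WG_SIZE; ++lid) {
        out[lid] = tmp[(lid + 1) % WG_SIZE];
    }
}
```

Calling `run_work_group(in, out)` with two arrays of `WG_SIZE` integers runs the whole work-group. Because neither loop carries a dependence between iterations, a later pass is free to vectorize or otherwise parallelize each loop, which is the kind of information the abstract says is preserved for the generic compiler passes.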


International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation | 2015

Power optimizations for transport triggered SIMD processors

Joonas Multanen; Timo Viitanen; Henry Linjamäki; Heikki Kultala; Pekka Jääskeläinen; Jarmo Takala; Lauri Koskinen; Jesse Simonsson; Heikki Berg; Kalle Raiskila; Tommi Zetterman

Power consumption is a key aspect of modern processor design. Optimizing the processor for power leads to direct savings in battery energy consumption in the case of mobile devices. At the same time, many mobile applications demand high computational performance. In large-scale computing, low-power compute devices help in thermal design and in reducing the electricity bill. This paper presents a case study of a customized low-power vector processor design that was synthesized on a 28 nm process technology. The processor has a programmer-exposed datapath based on the transport triggered architecture programming model. The paper's focus is on the RTL and microarchitecture level power optimizations applied to the design. Using a semiautomated interconnection network and register file optimization algorithm, power savings of up to 27% were achieved. Using this as a baseline and applying register file datapath gating, register file banking and clock gating of individual pipeline stages in pipelined function units, up to 26% power and energy savings could be achieved with only a 3% area overhead. On top of this, for the measured radio applications, the exposed datapath architecture helped to achieve approximately 18% power improvement in comparison to a VLIW-like architecture by utilizing optimizations unique to transport triggered architectures.
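Because the second saving is reported relative to the already optimized baseline, the two percentages multiply rather than add. The sketch below shows that arithmetic. It assumes, which the abstract does not state, that both "up to" figures apply to the same workload, so the result is an upper bound for illustration and not a number reported by the authors. The function name `combined_saving` is hypothetical.

```c
/*
 * Combine two successive relative savings, where the second saving is
 * measured against the result of the first. Inputs are fractions,
 * e.g. 0.27 for 27%.
 *
 * Assumption (not from the paper): both savings apply to the same
 * workload, so the remaining power fractions can be multiplied.
 */
double combined_saving(double first, double second)
{
    double remaining = (1.0 - first) * (1.0 - second);
    return 1.0 - remaining;
}
```

Under that assumption, `combined_saving(0.27, 0.26)` evaluates to 1 - 0.73 × 0.74 = 0.4598, or roughly 46% relative to the unoptimized design, instead of the 53% that simple addition would suggest.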


Archive | 2009

Co-existence between radio access units

Tommi Zetterman; Antti-Veikko Piipponen; Kalle Raiskila


Archive | 2006

Controlling a mobile device

Antti Piipponen; Kalle Raiskila; Tommi Zetterman


Archive | 2009

Wireless resource sharing framework

Tommi Zetterman; Kalle Raiskila


Archive | 2010

Method and apparatus for providing portability of partially accelerated signal processing applications

Heikki Berg; Harri Hirvola; Tommi Zetterman; Kalle Raiskila


Archive | 2008

Multi-radio scheduling and resource sharing on a software defined radio computing platform

Kees van Berkel; David Van Kampen; Pjotr Kourzanov; Orlando Moreira; Kalle Raiskila; Tommi Zetterman


Archive | 2009

Radio frequency apparatus

Antti Piipponen; Aarno Pärssinen; Konsta Sievanen; Tommi Zetterman; Kalle Raiskila


Archive | 2008

Multiple radio instances using software defined radio

Antti-Veikko Piipponen; Kalle Raiskila; Pasi Rinne-rahkola; Tommi Zetterman; Heikki Berg


Archive | 2009

A radio frequency apparatus

Antti Piipponen; Aarno Pärssinen; Konsta Sievanen; Tommi Zetterman; Kalle Raiskila

Collaboration


Top Co-Authors

Jarmo Takala

Tampere University of Technology

Pekka Jääskeläinen

Tampere University of Technology
