Jari Kreku
VTT Technical Research Centre of Finland
Publication
Featured research published by Jari Kreku.
Eurasip Journal on Embedded Systems | 2008
Jari Kreku; Mika Hoppari; Tuomo Kestilä; Yang Qu; Juha-Pekka Soininen; Per Andersson; Kari Tiensyrjä
Future mobile devices will be based on heterogeneous multiprocessing platforms accommodating several stand-alone applications. The network-on-chip communication and device networking combine the design challenges of conventional distributed systems and resource-constrained real-time embedded systems. Interoperable design space exploration for both application and platform development is required. The application designer needs abstract platform models to rapidly check the feasibility of a new feature or application. The platform designer needs abstract application models for defining platform computation and communication capacities. We propose a layered UML application/workload and SystemC platform modelling approach that allows application and platform to be modelled at several levels of abstraction, which enables early performance evaluation of the resulting system. The overall approach has been tested with a mobile video player case study, while different load extraction methods were validated previously by applying them to MPEG-4 encoder, Quake2 3D game, and MP3 decoder case studies.
international conference on vlsi design | 2003
Juha-Pekka Soininen; Axel Jantsch; Martti Forsell; Antti Pelkonen; Jari Kreku; Shashi Kumar
Exploitation of silicon capacity will require improvements in design productivity and more scalable system paradigms. Asynchronous message passing networks on chip (NOC) have been proposed as backbones for billion-transistor ASICs. We present a novel layered backbone-platform-system (BPS) design methodology for development of network-on-chip based products. It combines and extends the distributed, parallel, embedded and platform-based design concepts in order to manage the diversity and complexity of NOC-based systems. The reuse of communication principles in various platforms, the reuse of platforms in product differentiation, and system-level decision-support methods are the cornerstones of our methodology. The presented mappability estimation and workload simulations demonstrate the feasibility of such methods.
design, automation, and test in europe | 2010
Jari Kreku; Kari Tiensyrjä; Geert Vanmeerbeeck
Future embedded system products, e.g. smart hand-held mobile terminals, will accommodate a large number of applications that will run partly sequentially and independently, and partly concurrently and interacting, on massively parallel computing platforms. Even for systems of moderate complexity, the design space will be huge, and its exploration requires that the system architect is able to quickly evaluate the performance of candidate architectures and application mappings. The mainstream evaluation technique today is the system-level performance simulation of the applications and platforms using abstracted workload and processing capacity models, respectively. These virtual system models allow fast simulation of large systems at an early phase of development with reasonable modeling effort and time. The accuracy of the performance results depends on how closely the models used reflect the actual system. This paper presents a compiler-based technique for automatic generation of workload models for performance simulation, while exploiting an overall approach and platform performance capacity models developed previously. The resulting workload models are evaluated using x264 video and JPEG encoding application examples.
forum on specification and design languages | 2008
Jari Kreku; Mika Hoppari; Tuomo Kestilä; Yang Qu; Juha-Pekka Soininen; Kari Tiensyrjä
An increasing number of concurrent applications in future mobile devices will be based on parallel heterogeneous multiprocessor system-on-chip platforms using network-on-chip communication to achieve scalability. In this paper we describe a performance modeling and simulation approach for efficiently exploring the application-platform design space at system level. The application behavior is abstracted to workload models that are mapped onto performance models of the execution platform for transaction-level simulation. The approach separates application and platform through service-oriented modeling. Experiments with the approach in virtual network computing and mobile video player case studies are presented.
conference on design and architectures for signal and image processing | 2011
Jukka Saastamoinen; Jari Kreku
As most of the applications of embedded system products are realized in software, the performance estimation of software is crucial for successful system design. A significant part of the functionality of these applications is based on services provided by the underlying software libraries. A commonly used performance evaluation technique today is the system-level performance simulation of the applications and platforms using abstracted workload and execution platform models. The accuracy of the software performance results depends on how closely the application workload model reflects the actual software as a whole. This paper presents a methodology that combines compiler-based workload model generation for user code with workload extraction from pre-compiled libraries, while exploiting an overall approach and execution platform model developed previously. The benefit of the proposed methodology compared to the earlier solution is evaluated using a set of benchmarks.
Archive | 2012
Jari Kreku; Kari Tiensyrjä; Andreas Wieferink; Bart Vanthournout
ASIP exploration uses the mappability method for the selection of processor core and algorithm combinations for multi-core designs. The mappability estimation is based on the analysis of the correlations of algorithm and core characteristics. This information is used for narrowing the exploration space of the subsequent ASIP design, which exploits a commercial ASIP design environment, Synopsys Processor Designer. According to simulation results, the proposed ASIPs are able to achieve up to 96% of maximum performance with a clear reduction in complexity.
Proceedings of the Tenth International Symposium on Hardware/Software Codesign. CODES 2002 (IEEE Cat. No.02TH8627) | 2002
Juha-Pekka Soininen; Jari Kreku; Yang Qu; Martti Forsell
A mappability metric and a novel method for evaluating the goodness of processor core and algorithm combinations are introduced. The new mappability concept is an addition to the performance and cost metrics used in existing codesign and system synthesis approaches. The mappability estimation is based on the analysis of the correlation or similarity of algorithm and core architecture characteristics. It allows fast design space exploration of core architectures and mappings with little modeling effort. The method is demonstrated by analyzing suitable processor core architectures for the baseband algorithms of a WLAN modem. In total, 140400 architecture-algorithm pairs were analyzed, and the estimated results were similar to the results of more detailed evaluations. The method is not, however, limited to the WLAN modem, but is applicable to digital signal processing in general.
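The abstract above describes mappability as a correlation or similarity between algorithm and core characteristics, without giving the formula. The following is a minimal sketch of that idea only, assuming cosine similarity as the measure and four made-up characteristic axes; the names `ALGORITHMS`, `CORES`, `cosine_similarity` and `rank_pairs` and all numbers are hypothetical and are not taken from the paper.

```python
from math import sqrt

def cosine_similarity(a, b):
    """Similarity of two characteristic vectors (assumed measure, not the paper's)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical axes: (arithmetic, memory access, control flow, parallelism).
# Algorithm vectors describe what the code demands; core vectors what the core offers.
ALGORITHMS = {
    "fir_filter": (0.7, 0.2, 0.05, 0.05),
    "viterbi": (0.3, 0.3, 0.3, 0.1),
}
CORES = {
    "dsp_like": (0.7, 0.2, 0.05, 0.05),
    "risc_like": (0.25, 0.25, 0.4, 0.1),
}

def rank_pairs(algorithms, cores):
    """Score every algorithm-core pair and return them best match first."""
    scores = [
        (cosine_similarity(alg_vec, core_vec), alg_name, core_name)
        for alg_name, alg_vec in algorithms.items()
        for core_name, core_vec in cores.items()
    ]
    return sorted(scores, reverse=True)

if __name__ == "__main__":
    for score, alg_name, core_name in rank_pairs(ALGORITHMS, CORES):
        print(f"{alg_name:12s} on {core_name:10s} similarity {score:.3f}")
```

Because each pair is scored from two short vectors rather than from a detailed simulation, a large number of pairs can be ranked cheaply, which is in the spirit of the fast exploration the abstract claims.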
ieee computer society annual symposium on vlsi | 2010
Bernard Candaele; Sylvain Aguirre; Michel Sarlotte; Iraklis Anagnostopoulos; Sotirios Xydis; Alexandros Bartzas; Dimitris Bekiaris; Dimitrios Soudris; Zhonghai Lu; Xiaowen Chen; Jean-Michel Chabloz; Ahmed Hemani; Axel Jantsch; Geert Vanmeerbeeck; Jari Kreku; Kari Tiensyrjä; Fragkiskos Ieromnimon; Dimitrios Kritharidis; Andreas Wieferink; Bart Vanthournout; Philippe Martin
The project will address two main challenges of prevailing architectures: 1) the global interconnect and memory bottleneck due to a single, globally shared memory with high access times and power consumption, and 2) the difficulties in programming heterogeneous, multi-core platforms, in particular in dynamically managing data structures in distributed memory. MOSART aims to overcome these through a multi-core architecture with distributed memory organisation, a Network-on-Chip (NoC) communication backbone and configurable processing cores that are scaled, optimised and customised together to achieve diverse energy, performance, cost and size requirements of different classes of applications. MOSART achieves this by: A) providing platform support for management of abstract data structures, including middleware services and a run-time data manager for NoC-based communication infrastructure, and B) developing tool support for parallelizing and mapping applications on the multi-core target platform and customizing the processing cores for the application.
design, automation, and test in europe | 2002
Juha-Pekka Soininen; Jari Kreku; Yang Qu
A method for the selection of processor core and algorithm combinations for system-on-chip designs is presented. The method uses a mappability concept that is an addition to the performance and cost metrics used in codesign. The mappability estimation is based on the analysis of the correlations of algorithm and core characteristics. The method is demonstrated with an analysis tool, and the experimental results with DSP cores and algorithms are in line with expectations.
Journal of Systems Architecture | 2013
Janne Vatjus-Anttila; Jari Kreku; Juha Korpi; Subayal Khan; Jukka Saastamoinen; Kari Tiensyrjä
Future interactive embedded systems will support a large number of applications providing users with services related to e.g. telecommunication, audio and video, digital television, internet and navigation. To accommodate these performance demanding applications, the digital processing architectures will evolve from current system-on-chips to massively parallel computers consisting of heterogeneous subsystems connected by a network-on-chip. More flexibility, scalability and modularity are needed from the embedded devices. Consequently, the complexity of system design will increase by orders of magnitude. New methods and tools are needed for the performance evaluation of future embedded systems due to the increasing system complexity. This paper presents a high-level performance modelling and simulation approach called ABSOLUT that alleviates exploration complexity by using abstract virtual system models. The characteristics of the applications are abstracted to workload models that at the bottom level consist of instruction-like primitives. The workload models can be created from application specifications, measurement results, execution traces or source code. The complexity of the execution platform models is reduced since the processing elements need not be modelled in detail and data transfers and storage are simulated only from the performance point of view. The approach enables early evaluation, since the modelling and simulation of complete systems does not require mature hardware or software to exist. ABSOLUT has been applied to a number of case studies including mobile phone usage, MP3 playback, MPEG4 encoding and decoding, 3D gaming, virtual network computing and parallel software defined radio applications. The platforms modelled are either existing or future designs for both embedded systems and personal computers. 
In several cases, the results obtained from simulations are compared to measurements from real platforms, revealing an average difference of 12% in the results. This exceeds the accuracy requirements expected from virtual system based simulation approaches intended for early evaluation. In this paper, the most recent enhancements of the ABSOLUT methodology and tool framework are applied in an FFMPEG case study on an OMAP4 platform model. The simulation results are compared with those obtained from execution on an OMAP4-based PandaBoard.
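The abstract above describes workload models built from instruction-like primitives, run on an abstract platform model, with simulated results compared against measurements. The following is a minimal sketch of that flow only, not the ABSOLUT tool: the primitive kinds, the `PlatformModel` cost parameters and every number are assumptions chosen for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Primitive:
    kind: str   # assumed kinds: "execute", "read", "write"
    count: int  # instructions for "execute", bytes for "read"/"write"

@dataclass(frozen=True)
class PlatformModel:
    # Hypothetical capacity parameters; no processing element is modelled in detail.
    cycles_per_instruction: float
    cycles_per_byte_read: float
    cycles_per_byte_write: float
    clock_hz: float

    def cycles_for(self, primitive):
        cost = {
            "execute": self.cycles_per_instruction,
            "read": self.cycles_per_byte_read,
            "write": self.cycles_per_byte_write,
        }
        return primitive.count * cost[primitive.kind]

def simulate(workload, platform):
    """Return the estimated execution time in seconds for a list of primitives."""
    total_cycles = sum(platform.cycles_for(p) for p in workload)
    return total_cycles / platform.clock_hz

def relative_difference(simulated, measured):
    """Relative difference between simulation and measurement (0.1 means 10%)."""
    return abs(simulated - measured) / measured

if __name__ == "__main__":
    workload = [Primitive("execute", 1000), Primitive("read", 200), Primitive("write", 100)]
    platform = PlatformModel(1.0, 2.0, 4.0, clock_hz=1e6)
    estimate = simulate(workload, platform)
    # The measured value below is invented purely to exercise the comparison.
    print(f"estimate {estimate:.6f} s, difference {relative_difference(estimate, 0.002):.1%}")
```

Averaging `relative_difference` over several case studies would give a figure of the same kind as the 12% quoted above, although how the paper computes that average is not stated here.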